Abstract

We introduce a graphical framework for Bayesian inference that is sufficiently general to accommodate
not just the standard case but also recent proposals for a theory of quantum Bayesian inference wherein
one considers density operators rather than probability distributions as representative of degrees of
belief. The diagrammatic framework is stated in the graphical language of symmetric monoidal categories
and of compact structures and Frobenius structures therein, in which Bayesian inversion boils down to
transposition with respect to an appropriate compact structure. We characterize classical Bayesian
inference in terms of a graphical property and demonstrate that our approach eliminates some purely
conventional elements that appear in common representations thereof, such as whether degrees of belief
are represented by probabilities or entropic quantities. We also introduce a quantum-like calculus
wherein the Frobenius structure is noncommutative and show that it can accommodate Leifer's calculus of
'conditional density operators'. The notion of conditional independence is also generalized to our
graphical setting and we make some preliminary connections to the theory of Bayesian networks. Finally,
we demonstrate how to construct a graphical Bayesian calculus within any dagger compact category.

1 Introduction

In this paper we introduce a graphical calculus and corresponding axiomatics in terms of monoidal
categories for a very general notion of Bayesian inference. It enables one to reason at a highly
abstract level about theories more general than classical Bayesian inference, including a proposal for
a theory of quantum Bayesian inference introduced by Leifer [29] and subsequently developed by Leifer
and Poulin [30], and Leifer and Spekkens [31]. This theory is a noncommutative generalization of the
theory of classical Bayesian inference wherein one replaces probability distributions over a set of
random variables by density operators over a set of systems, marginals by reduced density operators,
and conditionals by positive operators for which the partial trace over the conditioned system yields
the identity operator on the conditioning system. The graphical language exploits the two-dimensional
diagrammatic representation to distinguish givens and conclusions. Bayesian inversion is diagrammatic
transposition in terms of the compact structures [24, 25]. Frobenius structures [9] will be our vehicle
for expressing notions such as conditionalization and relations of conditional independence.
'Classical' Bayesian inference is characterized in terms of a condition of commutativity for the
Frobenius structure and therefore this structure is key to expressing Bayesian updating in the specific
case of classical Bayesian inference.

There already exists a graphical language for expressing general quantum physical processes on
quantum states over a finite-dimensional Hilbert space, namely Selinger's CPM(FdHilb) [36]. As
shown by Coecke, Paquette and Pavlovic, this together with certain kinds of Frobenius structures defines
a language for classical stochastic processes as a special case [13]. However, a Bayesian inference need
not correspond to a physically realizable process taking initial quantum states to final states. Rather,
it is a computational process, taking givens to conclusions. Therefore, such inferences cannot in general
be expressed within CPM(FdHilb) and the present work provides the extension of the graphical language
that is required to accommodate Bayesian inference.

An abstract representation of Bayesian inference allows one to identify which aspects of the standard
probability calculus are merely conventional. For instance, in the context of R. T. Cox's derivation of
the rules of classical Bayesian inference [17], the standard assumption that one's degree of belief about
a proposition $a$ ought to be represented by a number $p(a)$ between 0 and 1 and that one multiplies a
conditional probability with a marginal to get the joint probability, i.e. $p(a,b) = p(a|b)\,p(b)$, is
seen to be a consequence of a choice of convention. One could equally well represent this degree of
belief by any bijective function of $p(a)$ such as $s(a) = -\log p(a)$, in which case
$s(a,b) = s(a|b) + s(b)$ and one replaces the standard form of Bayes' rule,
$p(a|b) = p(b|a)\,p(a) / p(b)$, with its "entropic" form $s(a|b) = s(b|a) + s(a) - s(b)$. The abstract
approach taken in this work finds a similar result and thereby contributes to the project of extracting
the elements of Bayesian inference that are independent of convention.
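The equivalence of the two conventions can be checked numerically. The following is a minimal sketch, assuming a small made-up joint distribution over two binary propositions; the names `joint`, `marginal`, `s` and `entropic_bayes_gap` are ours and purely illustrative.

```python
import math

# Hypothetical joint distribution p(a, b) over two binary propositions (illustrative numbers only).
joint = {
    ("a0", "b0"): 0.10, ("a0", "b1"): 0.30,
    ("a1", "b0"): 0.20, ("a1", "b1"): 0.40,
}

def marginal(joint, axis):
    """Sum the joint over the other variable; axis 0 gives p(a), axis 1 gives p(b)."""
    out = {}
    for key, value in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + value
    return out

def s(p):
    """Entropic representation of a degree of belief: s = -log p."""
    return -math.log(p)

p_a = marginal(joint, 0)
p_b = marginal(joint, 1)

def entropic_bayes_gap(a, b):
    """Return s(a|b) - [s(b|a) + s(a) - s(b)], which should vanish up to rounding."""
    p_a_given_b = joint[(a, b)] / p_b[b]
    p_b_given_a = joint[(a, b)] / p_a[a]
    return s(p_a_given_b) - (s(p_b_given_a) + s(p_a[a]) - s(p_b[b]))

gaps = [entropic_bayes_gap(a, b) for (a, b) in joint]
```

Every entry of `gaps` is zero up to floating-point rounding, which is just the logarithm of the multiplicative form of Bayes' rule.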

Our graphical representation of Bayesian inference is also likely to have a close connection with
the theory of Bayesian networks, and therefore may shed light on quantum analogues of these [30].
This has practical interest in the field of quantum information theory as quantum analogues of belief
propagation algorithms are a natural avenue to quantum error correction schemes. As an example of
this connection, the quantum analogue of Bayes' rule has the same form as the approximate reversal
channel of Barnum and Knill [6]. Furthermore, given that Bayesian networks provide a powerful tool
for inferring something about the causal relations that hold among propositions from the relations of
conditional independence that exist in their correlations [34], we also hope that our graphical calculus
might ultimately help to infer causal relations from quantum correlations and shed light on the quantum
violation of Bell's notion of local causality.

Finally, there has been a great deal of interest recently about general probabilistic theories that 
are distinct from both classical probability theory and quantum theory, e.g. (USKH. By considering 
a broad landscape of theories, one can hope to identify which aspects of quantum theory are shared 
with all operational probabilistic theories and which are unique to it. The framework we develop here 
provides a novel way of attacking this problem. By considering quantum theory as a theory of Bayesian 
inference, one is led to question which aspects of the theory are shared by all theories of Bayesian 
inference (insofar as one can define such a set) and which are unique to it. 



The logic of categorical graphical languages. A pedestrian introduction to the graphical calculi for
symmetric monoidal categories is in [12] and a comprehensive survey on these kinds of results is in
[38]. These graphical languages trace back to Penrose's work in the early 1970s.

Compact categories show up in a range of areas of mathematical physics including knot theory and
the Temperley-Lieb algebra [20, 43] and the theory of quantum groups [39]. Dagger compact categories
have recently been exploited by Abramsky and Coecke in quantum information theory [1] and in
proposals for quantum gravity [2]. Frobenius structures trace back to Ferdinand Georg Frobenius' work
on the representation theory of finite groups. They provide a very concise presentation of topological
quantum field theories [3, 26], they provide a bridge between classical and linear logic [32], and they
allow diagrammatic axiomatization of quantum observables and C*-algebras [16, 42]. Similarly, they allow
one to distinguish between classical and quantum states [11].

To know how much one can actually prove in a diagrammatic language one relies on the correspondence
between graphical languages and certain kinds of monoidal categories, for example:

Theorem 1.1 (Joyal-Street 1991 [23]). An equation follows from the axioms of symmetric monoidal
categories if and only if it can be derived in the corresponding graphical language.

Theorem 1.2 (Kelly-Laplaza 1980, Selinger 2007 [25, 36]). An equation follows from the axioms of
(dagger) compact categories if and only if it can be derived in the corresponding graphical language.

If one knows to which categorical structure a certain graphical calculus corresponds, one may ask
whether there exist complete models of these.^1 We are aware of two results of this nature:

Theorem 1.3 (Hasegawa-Hofmann-Plotkin 2008 [21]). An equation follows from the axioms of traced
monoidal categories if and only if it holds in finite dimensional vector spaces.

Theorem 1.4 (Selinger 2010 [37]). An equation follows from the axioms of dagger compact categories
if and only if it holds in finite dimensional Hilbert spaces.



Theorem 1.4 is a highly surprising and powerful theorem. When paired with Theorem 1.2 it implies
that an important set of equational statements in quantum theory holds if and only if it can be derived
in the graphical calculus. This result is moreover relevant not only for theories related to quantum
mechanics but also for classical probabilistic ones, since the latter can be represented in the category
of Hilbert spaces, linear maps and the tensor product by means of Frobenius structures [13].
Unfortunately, there are no completeness results yet of the above kind directly involving Frobenius
structures.

^1 That is, models which enable one to embed the corresponding free such categories, and hence which
are such that an equational statement holds in all models if and only if it is a consequence of the
axioms of the categorical structure.

Results like Theorem 1.4 are obviously also important in the context of automated reasoning, and
important steps towards automated reasoning with compact structures and Frobenius structures have
already been made [18, 19]. The developments in this paper make these tools available to the study of
(generalized) Bayesian inference.

Given the importance of dagger compact categories in the light of Theorem 1.4 we construct a class
of theories that generalize quantum Bayesian inference to any dagger compact category, and for which
the concrete non-commutative Frobenius structures arise from the underlying commutative compact
structures.



Structure of the paper. In Section 2 we review compact structures, compact categories and dagger
compact categories, dagger Frobenius structures therein, the interaction of the latter with compact
structures, and the graphical calculus of all of these. In Section 3 we define general Bayesian graphical
calculi, and also define the restricted case of classical Bayesian graphical calculi. We provide an
example of a classical Bayesian graphical calculus and show that it is in fact canonical. We show how
entropies provide a model of classical Bayesian graphical calculus. We also provide an example of
a non-classical one, namely the one introduced by Leifer [29, 30], to illustrate the generality of the
framework. Typically, while for classical Bayesian graphical calculi the Frobenius multiplication will
act commutatively, for non-classical ones it will act non-commutatively. We also observe that the key
structural component of Bayesian graphical calculi, the Frobenius comultiplication, is in fact a logical
broadcasting operation (a map from one object to a pair of these such that the final state has both
marginals equal to the initial state). Section 4 relies on compact structures to obtain a graphical
presentation where givens are inputs and conclusions are outputs. In this setting, Bayesian inversion is
nothing but transposition, and looks as follows:

[diagram]




In Section 5 we define and study the important concept of conditional independence. We provide a
simple example of how an assumption of conditional independence leads to a generalized formula for
pooling multiple states of belief, and briefly discuss the semi-graphoid axioms and the connection
between our work and Bayesian networks. Finally, in Section 6 we provide an explicit construction for
any dagger compact category of a non-commutative Frobenius multiplication, of which composition
of density operators is a special case. This also gives an explicit presentation of the non-completely
positive Frobenius comultiplication which plays the role of a logical broadcasting operation.

2 Background: dagger Frobenius structures and compact structures 

In this paper we work within the graphical language of symmetric monoidal categories (SMCs). In such
a graphical calculus associativity and unit natural isomorphisms are always strict, that is:

$(A \otimes B) \otimes C = A \otimes (B \otimes C)$ and $A \otimes I = A = I \otimes A$ . (1)

General morphisms (or operations) $f : A \to B$, which we interpret as 'processes', are represented as:

[diagram] (2)

while points (or elements) $e : I \to A$, which we interpret as 'states', are represented as:

[diagram] (3)

To emphasize that a state $e : I \to A \otimes B$ is bipartite we represent it as:

[diagram] (4)

Composition and tensoring are respectively represented as:

[diagram] (5)

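A compact structure on an object $A$ consists of another object $A^*$ together with a pair of morphisms:

$\eta_A = $ [diagram] $ : I \to A^* \otimes A$ and $\epsilon_A = $ [diagram] $ : A \otimes A^* \to I$, (6)

sometimes referred to as 'cups' and 'caps', which satisfy the 'yanking' equations:

$(\epsilon_A \otimes 1_A) \circ (1_A \otimes \eta_A) = 1_A$ and $(1_{A^*} \otimes \epsilon_A) \circ (\eta_A \otimes 1_{A^*}) = 1_{A^*}$ . (7)

Hence we depict $A$ by an upward arrow and $A^*$ by a downward one. We call $A^*$ the dual of $A$.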
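As a concrete illustration of the yanking equations, the following minimal sketch checks them for real matrices, under the assumption that an $n$-dimensional object is identified with its own dual and that the cup is the column vector $\sum_i e_i \otimes e_i$. The helper names (`kron`, `matmul`, `cup`, `cap`) are ours and purely illustrative.

```python
def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def matmul(A, B):
    """Ordinary matrix product of nested-list matrices."""
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def cup(n):
    """Column vector sum_i e_i (x) e_i, stored as an n^2 x 1 matrix."""
    return [[1 if i == j else 0] for i in range(n) for j in range(n)]

def cap(n):
    """Row vector given by the transpose of the cup, stored as a 1 x n^2 matrix."""
    return [[1 if i == j else 0 for i in range(n) for j in range(n)]]

n = 3
# (cap (x) 1) o (1 (x) cup) : A -> A (x) A (x) A -> A
snake_left = matmul(kron(cap(n), identity(n)), kron(identity(n), cup(n)))
# (1 (x) cap) o (cup (x) 1) : A -> A (x) A (x) A -> A
snake_right = matmul(kron(identity(n), cap(n)), kron(cup(n), identity(n)))
```

Both composites reduce to the identity matrix, which is the matrix counterpart of straightening a bent wire.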

A category C is a compact category (CC) [24, 25] if each object comes with a compact structure, and
these interact in a coherent manner [24, 25]. For a number of reasons, including 'planarity' of the
graphical representation of compact structures on compound objects, one usually adopts the convention
that duals are (strictly) contravariant with respect to the tensor, that is,

$(A \otimes B)^* = B^* \otimes A^*$ and $I^* = I$ . (8)

Cups and caps on a compound object $A \otimes B$ then become:

[diagram] (9)



When we moreover have that $A^{**} = A$ then the direction of arrows clearly distinguishes between
'no *' and '*'. In this case, coherence^2 requires us to set

$\eta_{A^*} = $ [diagram] $ = \sigma_{A^*,A} \circ \eta_A$ and $\epsilon_{A^*} = $ [diagram] $ = \epsilon_A \circ \sigma_{A^*,A}$ , (10)

where $\sigma_{A,B} : A \otimes B \to B \otimes A$ is the morphism that simply swaps the objects $A$ and $B$. We
refer to such a CC as strict. In this paper all CCs will be strict.

Remark 2.1. The over/under-crossings of wires in the pictures have no formal meaning (cf. braiding), 
but only serve to make pictures more readable. 

^2 This means that structural morphisms of the same type are equal. That is, if by means of composing
and tensoring symmetries, identities, cups and caps one can obtain morphisms $f, g : A \to B$ of the same
type, then these have to be equal.

In any CC each morphism $f : A \to B$ has a transpose

$f^* := (1_{A^*} \otimes \epsilon_B) \circ (1_{A^*} \otimes f \otimes 1_{B^*}) \circ (\eta_A \otimes 1_{B^*}) = $ [diagram] $ : B^* \to A^*$ . (11)

Contravariance of $(-)^*$ on objects implies that

$(f \otimes g)^* = g^* \otimes f^*$ . (12)
A CC is a dagger compact category (dCC) [1, 36] if it comes with a contravariant dagger functor

$(-)^\dagger : C^{op} \to C$

that coherently preserves the compact structure. An ordinary category that comes with such a functor is
called a dagger category (dC).

We call the composite of the transpose and the dagger the conjugate. Explicitly, for a morphism
$f : A \to B$ its conjugate is

$f_* := (f^\dagger)^* = (1_{B^*} \otimes \epsilon_A) \circ (1_{B^*} \otimes f^\dagger \otimes 1_{A^*}) \circ (\eta_B \otimes 1_{A^*}) = $ [diagram] $ : A^* \to B^*$ . (13)



Coherence now requires that fjA = rjA and ca = ca- In the graphical calculus this condition can be 
derived from the interpretation of the dagger as flipping pictures upside-down. 

A dagger Frobenius structure iS^O on an object A consists of an (internal) multiplication 



$m = $ [diagram] $ : A \otimes A \to A$ (14)

which is associative, has a two-sided unit

$u = $ [diagram] $ : I \to A$, (15)

and satisfies the dagger Frobenius law. Diagrammatically these are, respectively,

[diagrams] (16)



The morphism $\delta := m^\dagger = $ [diagram] $ : A \to A \otimes A$ is called a comultiplication and
$\epsilon := u^\dagger = $ [diagram] $ : A \to I$ its counit. A dagger Frobenius structure is commutative
when we have

$m = $ [diagram] $ = m \circ \sigma_{A,A}$ . (17)

A dagger Frobenius structure admits an elegant diagrammatic calculus in terms of 'spiders'
[26, 27, 11]. More precisely, one can show that any morphism

$f : \underbrace{A \otimes \dots \otimes A}_{n} \to \underbrace{A \otimes \dots \otimes A}_{m}$ (18)

obtained by composing and tensoring $m$, $u$, $m^\dagger$, $u^\dagger$, $1_A$ (and also $\sigma$ in the case
that the multiplication is commutative), and of which the diagrammatic representation is connected,
depends only on the number $n$ of inputs and the number $m$ of outputs. We represent this unique morphism
of that type as:

[diagram] (19)

and it is then also immediately clear that these 'spiders' compose as follows:

[diagram] (20)

This composition rule encapsulates all of the properties of a dagger Frobenius structure in Eq. (16), and
is below referred to as the spider theorem.

Each Frobenius structure induces a self-dual (i.e. $A = A^*$) compact structure

$\eta^{Frob}_A := \delta \circ u : I \to A \otimes A$ and $\epsilon^{Frob}_A := \epsilon \circ m : A \otimes A \to I$ , (21)

for which we have:

[diagram] (22)



Because of the self-duality, we can omit the arrows in these diagrams, all of which would point upward.
The dot in the cups and caps (where the arrows would change direction if we were to include them)
denotes the fact that the compact structure is self-dual. By self-duality we also have $A^{**} = A$, so
coherence requires that the compact structure satisfies (cf. Eq. (10)):

$\sigma_{A,A} \circ \eta^{Frob}_A = \eta^{Frob}_A$ and $\epsilon^{Frob}_A = \epsilon^{Frob}_A \circ \sigma_{A,A}$ . (23)



We call such a compact structure commutative. Obviously, in the case of a commutative Frobenius
multiplication, the induced compact structure is always commutative.

For such a self-dual compact structure the convention of Eq. (8) cannot be maintained since it leads
to $A \otimes B = (A \otimes B)^* = B^* \otimes A^* = B \otimes A$, which is easily seen to cause a collapse
of the structure. Hence cups and caps on compound objects now have to be denoted as:

[diagram] (24)

These arise from the canonical dagger Frobenius structure on $A \otimes B$ given one on both $A$ and $B$:

[diagram] (25)

Frobenius structures have played a prominent role in characterizing classical structures within
quantum theory. It was shown that for the dCC of finite dimensional Hilbert spaces, FdHilb, with
linear maps as the morphisms, the tensor product as the monoidal tensor, and linear algebraic adjoint as
the dagger, commutative dagger Frobenius structures are in bijective correspondence with orthogonal
bases [16]. When $m \circ \delta = 1_A$, a condition referred to as specialness [26], we obtain a bijective
correspondence with orthonormal bases. Therefore, these Frobenius structures were called classical
structures [15, 13]. Explicitly, given a basis $\{ |i\rangle \mid i \in \{1, \dots, n\} \}$ of a finite
dimensional Hilbert space $\mathbb{C}^n$, the comultiplication is the linear map that 'copies' these basis
elements:

$\delta : \mathbb{C}^n \to \mathbb{C}^n \otimes \mathbb{C}^n :: |i\rangle \mapsto |i\rangle \otimes |i\rangle$ . (26)

It has also been shown that in the dCC FdHilb, noncommutative special dagger Frobenius structures
are in bijective correspondence with noncommutative C*-algebras of linear operators acting on
$\mathbb{C}^n$ [42]. From the algebraic point of view, classical structures in FdHilb are maximal
commutative C*-algebras. Whether the Frobenius structure is commutative or not will also play an
important role in distinguishing the theories of quantum and classical Bayesian inference.



3 Bayesian graphical calculus 
3.1 Definition 

Consider a dagger symmetric monoidal category (dSMC) C in which each object comes with a dagger
Frobenius structure.

BC1 For every object $A \in |C|$, we assume the existence of a normalized state, that is, a point which
when composed with the counit yields the morphism $1_I : I \to I$ (the identity morphism on the
trivial object), which we depict by the 'empty picture':

[diagram] (27)

A normalized state for a composite object $A \otimes B \in |C|$,

[diagram] $ : I \to A \otimes B$ such that [diagram] (28)

will be called a joint state. Note that the composition of a joint state on $A \otimes B$ with the counit
on $B$ is a state on $A$, which we call the marginal state on $A$,

[diagram] $ : I \to A$ . (29)



For most of this article, we will be concerned with just a single joint state on a set of objects together 
with its marginals. Consequently, it is adequate for our purposes to label states by the object on which 
they are defined. On the few occasions in which it will be necessary to refer to two different states on a 
single object, we will distinguish these by a prime. 

BC2 For every object $A \in |C|$, we assume the existence of a modifier, that is, an endomorphism

[diagram] $ : A \to A$ (30)

which is such that

[diagram] (31)

and which is also self-transposed, that is,

[diagram] (32)



These modifiers are calculus-specific. We give concrete examples in Sections 3.4 and 3.2 of how one 
can construct modifiers in terms of marginal states and the Frobenius multiplication. 



Proposition 3.1. Since modifiers are self-transposed they can move along cups and caps:

[diagram] (33)

Proof: By the definition of the transpose and Eq. (22) we have:

[diagram] □



Definition 3.2. The inverse of a modifier [diagram] $ : A \to A$ is a process

[diagram] $ : A \to A$ (34)

such that

[diagram] (35)

Transpose invariance of modifiers also implies that their inverses can move along cups and caps:

[diagram]

Definition 3.3. The Frobenius inverse of a state [diagram] $ : I \to A$ relative to a Frobenius
multiplication is a point

[diagram] $ : I \to A$ (36)

such that

[diagram] (37)



As is standard for inverses, these Frobenius inverses are easily seen to be unique.

In our key examples, there will be marginal states and associated modifiers that do not have a
Frobenius inverse or an inverse modifier respectively. It turns out, however, that it suffices to have a
more general notion of inverse, namely inverses relative to a support.

Definition 3.4. A support for a state [diagram] $ : I \to A$ is a self-adjoint idempotent
[diagram] $ : A \to A$ which is such that:

1. [diagram], and

2. [diagram] for another self-adjoint idempotent [diagram] $ : A \to A$ implies [diagram].

We say that [diagram] $ : A \to A$ is the inverse to the modifier [diagram] $ : A \to A$ relative to this
support if we have that

[diagram] (38)

and that [diagram] $ : I \to A$ is the Frobenius inverse to the state [diagram] $ : I \to A$ relative to
this support if we have that

[diagram] (39)

where [diagram].

We can always take the support of a joint state on a composite object to be the tensor product of the 
supports of its marginals. 

Below all inverses are to be understood in this generalized sense, i.e. relative to a suitable support.
One can also incorporate the support within the Frobenius structure [diagram] by taking [diagram] as the
identity.

The reason we don't need to indicate the support explicitly in the current work is that we will
restrict our attention to a single joint state together with the marginals and conditional states
(cf. below) it defines and as such, we will never have need to consider states having different supports
on the same object.

BC3 We assume that each state admits a Frobenius inverse relative to its support and each modifier
admits an ordinary inverse relative to its support such that the latter is the modifier associated with
the former:

[diagram] (40)



Definition 3.5. For every joint state on a pair of objects, we can define a conditional state to be the
point

[diagram] (41)

A conditional state is such that if we compose the conditioned object (the one on the left of the
conditional bar) with the co-unit, we obtain the unit on the conditioning object (the one on the right of
the conditional bar):

[diagram] (42)
Definition 3.6. We call a graphical calculus with ingredients BC1, BC2, BC3 a Bayesian graphical
calculus.

This definition is motivated by the fact that with notions of joint states, marginal states, conditional
states, modifiers and inverses, we have the minimal amount of structure required to describe basic
concepts of Bayesian inference. For example, Bayes' rule is depicted as:

[diagram] (43)

We can straightforwardly extend the above to multiple variables $A, B, C, \dots$. When setting:

[diagram] (44)

it straightforwardly follows that:

[diagram] (45)

and that

[diagram] (46)

Many important concepts can now be defined at this high level of generality, most notably, conditional
independence (cf. Section 5 below), and many results can be derived graphically, e.g. pooling
(cf. Section 5.3 below).



3.2 Classical Bayesian graphical calculus 

Definition 3.7. A Bayesian graphical calculus is called classical if it satisfies the following equivalent
conditions:

(a) modifiers can move through the Frobenius structure:

[diagram] (47)

(b) modifiers are of the form:

[diagram] (48)

Proof: We first show that condition (a) implies condition (b). By the spider theorem (more specifically,
the counit law in Eq. (16)), Eq. (47) and Eq. (31):

[diagram]

The second equality in condition (b) is proven with the mirror image of this argument. Condition (b)
implies condition (a) since again by the spider theorem (more specifically, the Frobenius law in Eq. (16)),

[diagram]

The second equality in condition (a) is again proven with the mirror image of this argument. Finally,
note that the form of this modifier is consistent with the conditions BC2 and BC3, which are defining
conditions for a Bayesian graphical calculus. Clearly, the modifier acting on the unit gives the
associated point, Eq. (31), and it is self-transposed, Eq. (32). Also, the consistency condition on
inverses in Eq. (40), that is, the equivalence of [diagram] and [diagram], is automatically satisfied
because

[diagram] □



So in classical Bayesian graphical calculi, in addition to moving along cups and caps (cf. Proposition
3.1), modifiers can move through the Frobenius structure, and hence, by the spider theorem, through
arbitrary spiders.

Note that the conditions in Eq. (47) and Eq. (48) hold for states and modifiers of composite objects
using the Frobenius structure for the latter.

It is useful to consider some of the features of such a calculus. 

Proposition 3.8. In a classical Bayesian graphical calculus, modifiers on composite objects move
through the Frobenius structure of one of the objects:

[diagram] (49)
Proof: We have: 






by Eq. ([76]) and Eq. ([47]) applied to A® B. The other equalities are proven similarly. 



□ 



Proposition 3.9. In a classical Bayesian graphical calculus, the Frobenius multiplication always acts
commutatively on states, that is:

[diagram] (50)

Multiplication is also commutative if one or both of the states are replaced by conditional states.

Proof: By the spider theorem and Eq. (48) we have

[diagram] □



Note, however, that it could still be possible that the Frobenius structure itself is not commutative,
but just acts commutatively on the joint and marginal probabilities under consideration. This is the
case, e.g., for the $Q_{1/2}$-calculus of Section 3.5 below when all relevant density operators commute.

Proposition 3.10. In a classical Bayesian graphical calculus, composition of modifiers on an object is
commutative:

[diagram] (51)

Proof: By Eqs. (16), (33), (47) we have:

[diagram] □



For a classical Bayesian calculus, conditional states have the form:

[diagram] (52)

and Bayes' theorem, Eq. (43), has the form

[diagram] (53)

By virtue of the multiplicative commutativity, the order in which the states are 'Frobenius-multiplied'
doesn't matter (unlike the quantum generalization, as we will see).

This is an abstract characterization of classical Bayesian inference. We now present a couple of 
concrete realizations of this calculus. We shall thereby see how the abstract characterization avoids the 
conventional elements of the concrete realizations. 



3.3 Representations of the classical Bayesian graphical calculus 

Example 3.11. Standard probability theory. Standard probability theory constitutes a special case of
a classical Bayesian calculus. The objects are natural numbers and the morphisms from $n$ to $m$ are the
$m \times n$ positive-valued matrices (consequently the points are column vectors and their daggers are
row vectors). Composition is matrix product, and the tensor product is the matrix tensor product. It
follows that we have

[diagram] $ : I \to A = p := (p_1, p_2, \dots, p_n)$ , (54)

where $(p_1, p_2, \dots, p_n)$ denotes a column vector. The unit is

[diagram] $ : I \to A = u := (1, 1, \dots, 1)$ , (55)

which implies that the co-unit must be

[diagram] $ : A \to I = u^T = \mathrm{row}(1, 1, \dots, 1)$ , (56)

where $T$ denotes matrix transposition. The counit acting on a point gives the sum of the coefficients of
the associated vector

[diagram] $ = u^T p = \sum_j p_j$ . (57)

It follows that a normalized state in the Bayesian graphical calculus (cf. condition BC1) here
corresponds to a positive vector with coefficients that sum to 1:

[diagram] $ : I \to A = (p_1, p_2, \dots, p_n)$ such that $\sum_{j=1}^{n} p_j = 1$ . (58)

In other words, normalized states for an object are probability distributions over the set
$\{1, \dots, n\}$. Normalized states on a composite object $(nm)$ are simply probability distributions
over the set $\{1, \dots, nm\}$,

[diagram] $ : I \to A \otimes B = p := (p_{i,j} \mid i \in \{1, \dots, n\}, j \in \{1, \dots, m\})$ such that $\sum_{i=1}^{n} \sum_{j=1}^{m} p_{i,j} = 1$ . (59)

When the joint state is a tensor product of a state $(p_1, p_2, \dots, p_n)$ on $A$ and a state
$(q_1, q_2, \dots, q_m)$ on $B$,

[diagram] $ : I \to A \otimes B = (p_i q_j \mid i \in \{1, \dots, n\}, j \in \{1, \dots, m\})$ , (60)

we say that it is uncorrelated.

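The Frobenius multiplication is the $n \times n^2$ matrix $M := \mathrm{row}(M^{(1)}, M^{(2)}, \dots, M^{(n)})$
where $M^{(k)}$ is the $n \times n$ matrix which is zero everywhere except at the $k$th diagonal element,
where it is one:

[diagram] $ : A \otimes A \to A = M := \left( \begin{array}{cccc|cccc|c|cccc} 1 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & \cdots & 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 & 0 & 1 & \cdots & 0 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots & \vdots & & & \vdots & & \vdots & & & \vdots \\ 0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & \cdots & 0 & 0 & \cdots & 1 \end{array} \right)$ . (61)

Therefore, composing an arbitrary point on $A \otimes A$ with the Frobenius multiplication yields

[diagram] $ = M (p_{i,i'} \mid i, i' \in \{1, \dots, n\}) = (p_{i,i} \mid i \in \{1, \dots, n\})$ . (62)

If the point on $A \otimes A$ is a product of a state $(p_1, p_2, \dots, p_n)$ on $A$ and a different state
$(p'_1, p'_2, \dots, p'_n)$ also on $A$, then composing with the Frobenius multiplication yields

[diagram] $ = M (p_i p'_{i'} \mid i, i' \in \{1, \dots, n\}) = (p_i p'_i \mid i \in \{1, \dots, n\})$ . (63)

This is simply the component-wise product of the input vectors.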
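The action of $M$ in Eqs. (62) and (63) can be reproduced with a few lines of code. This is a minimal sketch, assuming that the tensor product of vectors is stored in Kronecker (row-major) order, so that the component $(i, i')$ sits at position $i n + i'$ with zero-based indices; the function names are ours and purely illustrative.

```python
def frobenius_multiplication(n):
    """Build the n x n^2 matrix M = row(M^(1), ..., M^(n)) as nested lists.

    Block k is an n x n matrix whose only nonzero entry is a 1 at position (k, k),
    so row k of M has its single 1 in column k * n + k (zero-based indices).
    """
    M = [[0] * (n * n) for _ in range(n)]
    for k in range(n):
        M[k][k * n + k] = 1
    return M

def kron_vec(p, q):
    """Tensor product of two vectors in Kronecker order: entry (i, j) at position i * len(q) + j."""
    return [pi * qj for pi in p for qj in q]

def mat_vec(M, v):
    """Apply a nested-list matrix to a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Two illustrative states on the same three-outcome object.
p = [0.5, 0.25, 0.25]
p_prime = [0.2, 0.4, 0.4]

product = mat_vec(frobenius_multiplication(3), kron_vec(p, p_prime))
componentwise = [a * b for a, b in zip(p, p_prime)]
```

Here `product` agrees with `componentwise`, and applying `frobenius_multiplication(2)` to the vector `[1, 2, 3, 4]` picks out the diagonal entries `[1, 4]`, as in Eq. (62).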

It follows from the above that the Frobenius co-multiplication is the $n^2 \times n$ matrix which is the
matrix transpose of $M$,

[diagram] $ : A \to A \otimes A = M^T$ . (64)

Composing an arbitrary state on $A$ with the comultiplication yields

[diagram] $ = M^T (p_i \mid i \in \{1, \dots, n\}) = (p_i \delta_{i,i'} \mid i, i' \in \{1, \dots, n\})$ , (65)

where

$\delta_{i,i'} = \begin{cases} 1 & \text{if } i = i' \\ 0 & \text{if } i \neq i' \end{cases}$ (66)



is the Kronecker delta. The comultiplication can therefore be understood as a classical broadcasting 
map [5] (see Section [33] below). It is tedious but straightforward to verify that these definitions of unit, 
co-unit, multiplication and co-multiplication yield a Frobenius structure. 
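The verification that is described as tedious above can be delegated to a short script for a fixed small $n$. The sketch below is only a numerical check for $n = 3$, not a proof; it assumes the Kronecker ordering of tensor products, and all helper names are ours.

```python
def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def matmul(A, B):
    """Ordinary matrix product of nested-list matrices."""
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def frobenius_multiplication(n):
    """The n x n^2 matrix M with a single 1 in row k at column k * n + k."""
    M = [[0] * (n * n) for _ in range(n)]
    for k in range(n):
        M[k][k * n + k] = 1
    return M

n = 3
I = identity(n)
m = frobenius_multiplication(n)   # multiplication, n x n^2
delta = transpose(m)              # comultiplication, n^2 x n
unit = [[1] for _ in range(n)]    # column vector u = (1, ..., 1)
counit = transpose(unit)          # row vector u^T

# Associativity: m o (m (x) 1) = m o (1 (x) m)
associative = matmul(m, kron(m, I)) == matmul(m, kron(I, m))
# Two-sided unit: m o (u (x) 1) = 1 = m o (1 (x) u)
unital = matmul(m, kron(unit, I)) == I and matmul(m, kron(I, unit)) == I
# Frobenius law: (1 (x) m) o (delta (x) 1) = delta o m = (m (x) 1) o (1 (x) delta)
frobenius = (matmul(kron(I, m), kron(delta, I)) == matmul(delta, m)
             and matmul(kron(m, I), kron(I, delta)) == matmul(delta, m))
# Broadcasting: discarding either copy after delta returns the input unchanged
broadcasts = (matmul(kron(counit, I), delta) == I
              and matmul(kron(I, counit), delta) == I)
```

All four flags evaluate to `True`. The last one expresses the broadcasting property in matrix form: both marginals of the copied state equal the original state.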

The cups and caps of the compact structure induced by this Frobenius structure are as follows:

[diagram] $ : A \otimes A \to I = t := \mathrm{row}(\delta_{i,i'} \mid i, i' \in \{1, \dots, n\})$ (67)

and

[diagram] $ : I \to A \otimes A = t^T = (\delta_{i,i'} \mid i, i' \in \{1, \dots, n\})$ . (68)

The latter can be interpreted, modulo normalization, as the probability distribution expressing perfect
correlation.

We now demonstrate that standard probability theory is a representation of the classical Bayesian 
graphical calculus. 

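The marginal state on $A$ of a joint state on $A \otimes B$ associated with probability distribution
$(p_{i,j} \mid i \in \{1, \dots, n\}, j \in \{1, \dots, m\})$ is simply the marginal distribution on $A$,
that is,

[diagram] $ = \left( \sum_{j=1}^{m} p_{i,j} \mid i \in \{1, \dots, n\} \right)$ . (69)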



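If one defines the modifier associated with a state $(p_1, \ldots, p_n)$ on $A$ through Eq. (48), then it is
represented by the $n \times n$ matrix

$$[\text{diagram}] = M[(p_1, p_2, \ldots, p_n) \otimes I] = \begin{pmatrix} p_1 & & 0 \\ & \ddots & \\ 0 & & p_n \end{pmatrix}. \tag{70}$$

The Frobenius inverse of a marginal state $(p_1, \ldots, p_n)$ on $A$ is the vector

$$[\text{diagram}] : \mathrm{I} \to A = (r_1, \ldots, r_n), \tag{71}$$

where

$$r_i := \begin{cases} p_i^{-1} & \text{if } p_i \neq 0 \\ 0 & \text{if } p_i = 0. \end{cases} \tag{72}$$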



Furthermore, one easily verifies that the inverse of the modifier in Eq. (70) is simply the matrix $\mathrm{diag}(r_1, r_2, \ldots, r_n)$,
so that property BC3 is indeed satisfied.
It follows that the conditional state on $A \otimes B$ that arises from the joint state on $A \otimes B$ is simply the
ordered set of conditional probability distributions that arise from the joint distribution $p_{i,j}$, that is,

$$[\text{diagram}] : \mathrm{I} \to A \otimes B = (p_{i|j} \mid i \in \{1,\ldots,n\},\, j \in \{1,\ldots,m\}), \tag{73}$$

where

$$p_{i|j} := \begin{cases} p_{i,j}\, p_j^{-1} & \text{if } p_j \neq 0 \\ 0 & \text{if } p_j = 0 \end{cases} \tag{74}$$

is a probability distribution over $i \in \{1, \ldots, n\}$, labeled by $j \in \{1, \ldots, m\}$. Note that because $p_j := \sum_i p_{i,j}$ in this expression, we can infer the normalization of $p_{i|j}$,

$$\sum_{i=1}^{n} p_{i|j} = 1 \quad \text{for all } j \in \{1, \ldots, m\}. \tag{75}$$



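Consequently the composition of the conditional state with the co-unit on $A$ is indeed the unit on $B$,

$$[(1,1,\ldots,1) \otimes I]\,(p_{i|j} \mid i \in \{1,\ldots,n\},\, j \in \{1,\ldots,m\}) \tag{76}$$

$$= \Big(\sum_i p_{i|j} \;\Big|\; j \in \{1,\ldots,m\}\Big) = (1,1,\ldots,1). \tag{77}$$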

In a slight abuse of notation, we can use A, B and C not only to denote the objects in our category 
but also to denote random variables associated with these. For instance, we take A to denote the ran- 
dom variable taking values from the set {1, . . . , n} where n is the natural number associated with the 
categorical object A. We can also follow a standard notation and write 

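$$p(A) := (p_a \mid a \in \{1,\ldots,n\}), \tag{78}$$
$$p(A,B) := (p_{a,b} \mid a \in \{1,\ldots,n\},\, b \in \{1,\ldots,m\}), \tag{79}$$
$$p(A|B) := (p_{a|b} \mid a \in \{1,\ldots,n\},\, b \in \{1,\ldots,m\}), \tag{80}$$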

etcetera. We can then write many equations in a simple form. For instance, Bayes' rule for classical
Bayesian graphical calculi, as in Eq. (53), takes the form

$$p(A|B) = \frac{p(B|A)\,p(A)}{p(B)}, \tag{81}$$

where this is understood to be an equality that holds component by component.
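As a numerical illustration of the standard probability representation, the following minimal sketch computes marginals, the Frobenius inverse, and conditionals from a joint table, and checks the component-wise Bayes' rule of Eq. (81). The joint table and all function names are illustrative assumptions, not taken from the paper.

```python
import math

# Made-up joint distribution p_{i,j} on A (x) B with n = 2, m = 3 (illustrative only).
JOINT = [[0.10, 0.20, 0.05],
         [0.30, 0.15, 0.20]]

def marginal_a(joint):
    """Marginal on A, Eq. (69): p_i = sum_j p_{i,j}."""
    return [sum(row) for row in joint]

def marginal_b(joint):
    """Marginal on B: p_j = sum_i p_{i,j}."""
    return [sum(row[j] for row in joint) for j in range(len(joint[0]))]

def frobenius_inverse(p):
    """Eq. (72): r_i = 1/p_i where p_i != 0, and 0 otherwise."""
    return [1.0 / x if x != 0 else 0.0 for x in p]

def cond_a_given_b(joint):
    """Eq. (74): p_{i|j} = p_{i,j} * p_j^{-1}; result indexed [i][j]."""
    r_b = frobenius_inverse(marginal_b(joint))
    return [[row[j] * r_b[j] for j in range(len(row))] for row in joint]

def cond_b_given_a(joint):
    """p_{j|i} = p_{i,j} * p_i^{-1}; result indexed [j][i]."""
    r_a = frobenius_inverse(marginal_a(joint))
    n, m = len(joint), len(joint[0])
    return [[joint[i][j] * r_a[i] for i in range(n)] for j in range(m)]

def bayes_invert(p_b_given_a, p_a, p_b):
    """Eq. (81), component-wise: p(A|B) = p(B|A) p(A) / p(B); result indexed [i][j]."""
    r_b = frobenius_inverse(p_b)
    n, m = len(p_a), len(p_b)
    return [[p_b_given_a[j][i] * p_a[i] * r_b[j] for j in range(m)] for i in range(n)]

direct = cond_a_given_b(JOINT)
inverted = bayes_invert(cond_b_given_a(JOINT), marginal_a(JOINT), marginal_b(JOINT))
agree = all(math.isclose(direct[i][j], inverted[i][j])
            for i in range(2) for j in range(3))
print(agree)
```

The zero branch in `frobenius_inverse` mirrors the convention of Eq. (72) that the inverse is taken only over the support.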



Example 3.12. Alternative representations. Here everything is defined as it was before - objects
are natural numbers, morphisms are positive-valued matrices, composition is the matrix product and
tensor product is the matrix tensor product - except that the underlying notions of scalar addition and
multiplication are modified. The new operations, denoted by $\boxplus$ and $\boxdot$ respectively, can be defined for
an arbitrary pair $s, t$ of scalars as follows. For any function $f$ that is bijective and hence invertible on
the positive reals, they are

$$s \boxplus t = f(f^{-1}(s) + f^{-1}(t)), \qquad s \boxdot t = f(f^{-1}(s)\, f^{-1}(t)). \tag{82}$$

One easily verifies that these two operations are commutative and associative and obey the distributive
law:

$$s \boxdot (t_1 \boxplus t_2) = (s \boxdot t_1) \boxplus (s \boxdot t_2).$$
The unit for the new notion of addition, denoted $0_{\boxplus}$ and satisfying $s \boxplus 0_{\boxplus} = s$ for all $s$, is

$$0_{\boxplus} = f(0), \tag{83}$$

while the unit for the new notion of multiplication, denoted $1_{\boxdot}$ and satisfying $s \boxdot 1_{\boxdot} = s$ for all $s$, is

$$1_{\boxdot} = f(1). \tag{84}$$

The new product of two matrices $M$ and $N$, denoted $M \,\boxed{\circ}\, N$, is defined accordingly:

$$[M \,\boxed{\circ}\, N]_{ij} = \boxplus_k \big([M]_{ik} \boxdot [N]_{kj}\big), \tag{85}$$

as is the new tensor product of two matrices, $M$ and $N$, denoted $M \boxtimes N$,

$$[M \boxtimes N]_{ik,jl} = [M]_{ij} \boxdot [N]_{kl}. \tag{86}$$

The Frobenius multiplication, co-multiplication, unit and co-unit are defined as before, but with the
scalars 0 and 1 replaced by $0_{\boxplus}$ and $1_{\boxdot}$. By construction, for every monotonic function $f$, we obtain a
representation of a Bayesian graphical calculus.
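The modified scalar operations can be sketched directly from Eqs. (82) to (85). The example below is only an illustration under an assumed choice of $f$ (squaring on the non-negative reals, which is not a choice made in the paper); the helper names `make_ops` and `box_matmul` are hypothetical.

```python
import math
from functools import reduce

def make_ops(f, f_inv):
    """Return the pair (box_add, box_mul) defined in Eq. (82)."""
    def box_add(s, t):
        return f(f_inv(s) + f_inv(t))
    def box_mul(s, t):
        return f(f_inv(s) * f_inv(t))
    return box_add, box_mul

# Hypothetical choice of f, for illustration only: squaring on the non-negative reals.
def f(x):
    return x * x

f_inv = math.sqrt
box_add, box_mul = make_ops(f, f_inv)
ZERO, ONE = f(0.0), f(1.0)  # units of Eqs. (83) and (84)

def box_matmul(M, N):
    """Eq. (85): entry (i, j) is the boxed sum over k of [M]_ik box_mul [N]_kj."""
    inner = len(N)
    return [[reduce(box_add, (box_mul(M[i][k], N[k][j]) for k in range(inner)))
             for j in range(len(N[0]))]
            for i in range(len(M))]

# Distributive law: s box_mul (t1 box_add t2) == (s box_mul t1) box_add (s box_mul t2)
s, t1, t2 = 2.0, 3.0, 5.0
lhs = box_mul(s, box_add(t1, t2))
rhs = box_add(box_mul(s, t1), box_mul(s, t2))
print(math.isclose(lhs, rhs))
```

Because both operations are obtained by conjugating ordinary addition and multiplication with $f$, `box_matmul` agrees with applying $f$ entrywise to the ordinary product of the $f^{-1}$-transformed matrices.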

It is useful to consider an example of this sort of alternative to the standard probability representa- 
tion. 

Example 3.13. The negative logarithm of probability representation. Consider the case where the
monotonic function $f$ is the negative natural logarithm (the generalization to an arbitrary base is
straightforward),

$$f(s) = -\ln s, \qquad f^{-1}(s) = e^{-s}, \tag{87}$$

so that

$$s \boxplus t = -\ln(e^{-s} + e^{-t}), \qquad s \boxdot t = s + t. \tag{88}$$

We then have (see the footnote)

$$[M \,\boxed{\circ}\, N]_{ij} = -\ln\Big[\sum_k e^{-([M]_{ik} + [N]_{kj})}\Big], \tag{90}$$

$$[M \boxtimes N]_{ik,jl} = [M]_{ij} + [N]_{kl}, \tag{91}$$

$$0_{\boxplus} = \infty, \tag{92}$$

$$1_{\boxdot} = 0. \tag{93}$$

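Now consider a state $(s_1, s_2, \ldots, s_n)$. For it to be normalized, it must satisfy the condition

$$(1_{\boxdot}, \ldots, 1_{\boxdot}) \,\boxed{\circ}\, (s_1, s_2, \ldots, s_n) = 1_{\boxdot}, \tag{94}$$

which implies that

$$-\ln\Big[\sum_k e^{-s_k}\Big] = 0, \tag{95}$$

that is,

$$\sum_k e^{-s_k} = 1. \tag{96}$$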



Footnote: As an aside, there is often a subtlety concerning inverses. The new multiplicative inverse of a scalar $s$, denoted $s^{-1}_{\boxdot}$, must
satisfy $s \boxdot s^{-1}_{\boxdot} = 1_{\boxdot}$. It follows that

$$s^{-1}_{\boxdot} = -s. \tag{89}$$

However, the new additive inverse of a scalar $s$, denoted $\boxminus s$, must satisfy $s \boxplus \boxminus s = 0_{\boxplus}$, which implies that $\boxminus s = s - \ln(-1)$,
which is undefined. Consequently, there are no additive inverses in this new calculus.
It follows that the components of the vector $(s_1, s_2, \ldots, s_n)$ are the negative logarithms of the
components of a probability distribution $(p_1, p_2, \ldots, p_n)$,

$$\forall k : s_k = -\ln p_k. \tag{97}$$

In this new calculus, an impossible value of $k$ (one for which $p_k = 0$) is represented by $s_k = \infty$, while
a certain value (one for which $p_k = 1$) is represented by $s_k = 0$.

We can represent these vectors as $s(A)$, $s(A,B)$, $s(A|B)$ and so forth. We find that we have

$$s(A|B) = s(A,B) - s(B), \tag{98}$$

which is understood component-wise, that is,

$$s(A|B) := (s_{a|b} \mid a \in \{1,\ldots,n\},\, b \in \{1,\ldots,m\}) \tag{99}$$

where

$$s_{a|b} := -\ln p_{a|b}. \tag{100}$$

Bayes' rule takes the form

$$s(A|B) = s(B|A) + s(A) - s(B). \tag{101}$$
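The negative-logarithm representation can be checked numerically. The sketch below, using a made-up joint table and hypothetical helper names, verifies the normalization condition of Eqs. (94) to (96) and the additive form of Bayes' rule in Eq. (101).

```python
import math

def neg_log(p):
    """Eq. (97): s = -ln p, with an impossible value represented by infinity."""
    return math.inf if p == 0 else -math.log(p)

def s_add(s, t):
    """Eq. (88): the modified addition, -ln(e^{-s} + e^{-t})."""
    total = math.exp(-s) + math.exp(-t)
    return math.inf if total == 0 else -math.log(total)

# Made-up joint distribution p_{a,b} with n = 2, m = 3 (illustrative only).
JOINT = [[0.10, 0.20, 0.05],
         [0.30, 0.15, 0.20]]
n, m = 2, 3
p_a = [sum(JOINT[a]) for a in range(n)]
p_b = [sum(JOINT[a][b] for a in range(n)) for b in range(m)]

s_a = [neg_log(x) for x in p_a]
s_b = [neg_log(x) for x in p_b]
# s(A|B) and s(B|A) computed from the conditional probabilities, Eq. (100).
s_a_given_b = [[neg_log(JOINT[a][b] / p_b[b]) for b in range(m)] for a in range(n)]
s_b_given_a = [[neg_log(JOINT[a][b] / p_a[a]) for a in range(n)] for b in range(m)]

# Normalization, Eqs. (94)-(96): the boxed sum of the components of s(A) is 1_box = 0.
norm = s_add(s_a[0], s_a[1])

# Bayes' rule, Eq. (101): s(A|B) = s(B|A) + s(A) - s(B), component by component.
bayes_ok = all(
    math.isclose(s_a_given_b[a][b], s_b_given_a[b][a] + s_a[a] - s_b[b])
    for a in range(n) for b in range(m)
)
print(bayes_ok)
```

The table is chosen with no zero entries so that no difference of infinities arises in the component-wise check.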



One has a choice in representing degrees of belief. It can be done with probabilities, but it can 
also be accomplished with negative logarithms of probabilities, or indeed any monotonic function of 
probabilities. It is a matter of convention only which is chosen. An argument to this effect was made 
by R. T. Cox in the context of an axiomatization of Bayesian inference [11]. We have supported Cox's
conclusion by demonstrating that an abstract graphical characterization of Bayesian inference shows 
certain aspects of the standard probability calculus to be merely conventional. 

Finally, note that by taking the usual inner product of the vector $s(A) := (s_1, s_2, \ldots, s_n)$ of negative
logarithms of probabilities with the vector $p(A) := (p_1, p_2, \ldots, p_n)$ of probabilities, one obtains the
Shannon entropy of the probability distribution $p(A)$, denoted $S(A)$,

$$S(A) := \sum_k p_k s_k = -\sum_k p_k \ln p_k. \tag{102}$$

One can similarly obtain the joint entropy as

$$S(A,B) := \sum_{i,j} p_{i,j}\, s_{i,j} = -\sum_{i,j} p_{i,j} \ln p_{i,j} \tag{103}$$

and the conditional entropy as

$$S(A|B) := \sum_{i,j} p_{i,j}\, s_{i|j} = -\sum_{i,j} p_{i,j} \ln p_{i|j}. \tag{104}$$

Noting the marginal entropy can also be obtained by averaging over the joint distribution,

$$\sum_{i,j} p_{i,j}\, s_i = \sum_i p_i s_i = S(A), \tag{105}$$

it follows that any expression that holds among joints, marginals and conditionals for negative
logarithms of distributions (i.e. among $s_{i,j}$, $s_i$, $s_{i|j}$ etcetera) also holds among the joint, marginal and
conditional entropies. For instance, Bayes' rule in terms of negative logarithms of probabilities, Eq. (101),
implies the analogous relation among entropies

$$S(A|B) = S(B|A) + S(A) - S(B). \tag{106}$$

Thus the classical Bayesian graphical calculus has the power to represent relations among classical
entropies.
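The passage from negative logarithms to entropies amounts to averaging over the joint distribution. The following sketch, with a made-up joint table and illustrative variable names, checks Eq. (105) and the entropic Bayes relation of Eq. (106).

```python
import math

# Made-up joint distribution p_{i,j} with n = 2, m = 3 (illustrative only).
JOINT = [[0.10, 0.20, 0.05],
         [0.30, 0.15, 0.20]]
n, m = 2, 3
p_a = [sum(JOINT[i]) for i in range(n)]
p_b = [sum(JOINT[i][j] for i in range(n)) for j in range(m)]
pairs = [(i, j) for i in range(n) for j in range(m)]

# Marginal entropies, Eq. (102).
S_A = -sum(p * math.log(p) for p in p_a)
S_B = -sum(p * math.log(p) for p in p_b)

# Conditional entropies, Eq. (104): average of s_{i|j} over the joint distribution.
S_A_given_B = -sum(JOINT[i][j] * math.log(JOINT[i][j] / p_b[j]) for i, j in pairs)
S_B_given_A = -sum(JOINT[i][j] * math.log(JOINT[i][j] / p_a[i]) for i, j in pairs)

# Eq. (105): averaging s_i over the joint distribution reproduces S(A).
S_A_from_joint = -sum(JOINT[i][j] * math.log(p_a[i]) for i, j in pairs)

# Eq. (106): S(A|B) = S(B|A) + S(A) - S(B).
entropic_bayes_ok = math.isclose(S_A_given_B, S_B_given_A + S_A - S_B)
print(entropic_bayes_ok)
```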

In more abstract terms one realizes this by considering the p- and the s-calculi as two distinct
composition and tensor structures on morphisms, above denoted by $(\circ, \otimes)$ and $(\boxed{\circ}, \boxtimes)$, where $\otimes$ and $\boxtimes$ do
coincide on objects. One then post-composes both sides in Eq. (53), realized in the s-calculus, with the
normalized joint state of the p-calculus by means of the $\circ$-composition. That is,

[diagram: both sides of the s-calculus equation composed with the p-calculus joint state] (107)

In other words, a 'p-operation'

[diagram: composition with $p(A,B)$] $: \mathbf{C}(A \otimes B, \mathrm{I}) \to \mathbf{C}(\mathrm{I}, \mathrm{I})$

is applied to both sides of an equation between s-terms in $\mathbf{C}(A \otimes B, \mathrm{I})$. Since such a p-operation can be
applied to both sides of any equation between s-terms in classical Bayesian calculus, such an equation
always results in a corresponding statement about classical entropies.

3.4 $Q_{1/2}$-calculus

Particular cases of Bayesian graphical calculi arise by choosing a specific construction of the modifiers. 
Definition 3.14. A Bayesian graphical calculus is a $Q_{1/2}$-calculus when modifiers are of the form:

[diagram] (108)

In this definition we introduced points [small triangle] $: \mathrm{I} \to A$ that are distinguished from the marginal states
by being denoted by smaller triangles. By Eq. (31) and the spider theorem, these must obey:

[diagram]

that is, they are the square roots of marginal states relative to the multiplication operation of the
Frobenius structure. Again by the spider theorem we also have the following lemma:

Lemma 3.15. If Eqs. (108) and (31) hold, and if [small triangle] has an inverse [diagram], then

[diagram] and [diagram] (109)

and hence, the consistency condition Eq. (40) is also automatically satisfied.

That modifiers are self-transposed now also comes for free:

Lemma 3.16. Modifiers of the form Eq. (108) are automatically self-transposed.

Proof: We have:

[diagram: chain of equalities]

where the 2nd and 3rd steps use the spider theorem, and the 4th one uses commutativity of the caps. □



In terms of the canonical dagger Frobenius structure on A we have: 



[diagram] (110)

where:

[diagram] (111)



which follows by naturality of symmetry. 

We will assume the existence of inverses of the square roots of marginals for $Q_{1/2}$-calculi, and
consequently, by Lemma 3.15, inverses of the marginals themselves will also exist in $Q_{1/2}$-calculi.

For $Q_{1/2}$-calculi, the Bayesian update law Eq. (43) becomes:

[diagram] (112)

In the final expression of Eq. (112), the order of the two small triangles on the left could be reversed
because they are not connected to each other by a spider. The same is true of the two small triangles on
the right.

3.5 A representation of the $Q_{1/2}$-calculus

Leifer's conditional density operator calculus [29, 30, 31] provides an example of a $Q_{1/2}$-calculus. The
objects are natural numbers and the morphisms from $n$ to $m$ are the linear maps from $\mathcal{L}(\mathbb{C}^n)$ to $\mathcal{L}(\mathbb{C}^m)$
where $\mathcal{L}(\mathbb{C}^m)$ is the space of linear operators on an $m$-dimensional complex Hilbert space. Composition
and tensor product of these maps are just the normal such notions. (Note that each such linear map can
be represented as an $n^2 \times m^2$ matrix, in which case composition is matrix product, and the tensor product
is the matrix tensor product.)

We take the point [diagram for $A$] to be a density operator $\rho_A \in \mathcal{L}(\mathbb{C}^n)$ and the point [diagram for $AB$] to be the
joint density operator $\rho_{AB} \in \mathcal{L}(\mathbb{C}^n) \otimes \mathcal{L}(\mathbb{C}^m)$. We take the Frobenius multiplication [diagram] to be
the (non-commutative) operator product of density operators, and hence the identity operator $I_A$ is its
unit [diagram]. The co-unit [diagram] is the trace operation (which is indeed the adjoint to the identity operator when
taken in a suitable manner [36]). It follows that a normalized state in this graphical calculus is simply a
density operator with trace one.

Given that the co-unit is the trace operation and recalling that the Frobenius co-multiplication must
satisfy co-unitality,

[diagram] (113)

we conclude that the Frobenius co-multiplication is a broadcasting operation, that is, a linear map
$B : \mathcal{L}(\mathbb{C}^n) \to \mathcal{L}(\mathbb{C}^n) \otimes \mathcal{L}(\mathbb{C}^n)$ satisfying

$$(\mathrm{tr}_A \otimes \mathrm{id}_A) \circ B = \mathrm{id}_A = (\mathrm{id}_A \otimes \mathrm{tr}_A) \circ B, \tag{114}$$

where $\mathrm{id}_A$ is the identity map on $\mathcal{L}(\mathbb{C}^n)$. Given the quantum no-broadcasting theorem [5], this
mathematical operation is necessarily non-physical, i.e. it is not a completely positive map.
If $\{|i\rangle \mid i \in \{1,\ldots,n\}\}$ is a basis of $\mathbb{C}^n$ so that $\{|i\rangle\langle j| \mid i,j \in \{1,\ldots,n\}\}$ is a basis for the
operator space $\mathcal{L}(\mathbb{C}^n)$, then $B$ can be defined by

$$B(|i\rangle\langle j|) = \sum_k |i\rangle\langle k| \otimes |k\rangle\langle j|, \tag{115}$$

which, despite appearances, is basis-independent. (In Section 6 we present a diagrammatic
representation of this operation in the dCC FdHilb.) It is straightforward to verify that this broadcasting operation
is associative and that the interplay between it and the operator product satisfies the Frobenius law.
Consequently, we have a representation of a non-commutative dagger Frobenius structure.
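The co-unitality condition of Eq. (114) can be checked numerically for the map defined in Eq. (115). The sketch below is a minimal illustration: the nested-list storage of the two-system operator and the function names are assumptions made here, and the test matrix is arbitrary rather than a genuine density operator (the map is linear, so this suffices).

```python
def broadcast(rho):
    """Apply Eq. (115) linearly: B(|i><j|) = sum_k |i><k| (x) |k><j|.

    The result is stored as T[a][b][c][d], the coefficient of |a><b| (x) |c><d|.
    """
    n = len(rho)
    T = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                T[i][k][k][j] += rho[i][j]
    return T

def trace_first(T):
    """(tr (x) id): trace out the first tensor factor."""
    n = len(T)
    return [[sum(T[a][a][c][d] for a in range(n)) for d in range(n)]
            for c in range(n)]

def trace_second(T):
    """(id (x) tr): trace out the second tensor factor."""
    n = len(T)
    return [[sum(T[a][b][c][c] for c in range(n)) for b in range(n)]
            for a in range(n)]

# Arbitrary test operator on C^3 (illustrative values only).
rho = [[0.5, 0.1, 0.0],
       [0.1, 0.3, 0.2],
       [0.0, 0.2, 0.2]]
T = broadcast(rho)
print(trace_first(T) == rho and trace_second(T) == rho)
```

Both partial traces return the input operator, which is the sense in which $B$ broadcasts; the code does not test complete positivity, which the text notes must fail.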

The 'cup' of the compact structure induced by this Frobenius structure is the partial transpose of the
maximally entangled state, that is, the operator $\sum_{i,j} |i\rangle\langle j| \otimes |j\rangle\langle i|$.

The Frobenius inverse of a state on $A$, [diagram], is the inverse of the associated density operator (more
precisely, the inverse over the support), namely $\rho_A^{-1}$. The square root of a state on $A$, [diagram], is the
square root of the associated density operator, namely $\sqrt{\rho_A}$. The modifier [diagram] is the completely
positive map $\sqrt{\rho_A}\,(-)\,\sqrt{\rho_A}$. The consistency condition Eq. (40) is now also clearly satisfied.



Given that the co-unit is the trace operation, marginal states arise by tracing out a system on a
joint density operator and therefore correspond to reduced density operators. The conditional state
[diagram] is Leifer's conditional density operator [29, 30], that is, a positive operator $\rho_{A|B} \in \mathcal{L}(\mathbb{C}^n) \otimes \mathcal{L}(\mathbb{C}^m)$
such that $\mathrm{Tr}_A[\rho_{A|B}] = I_B$. Note that the commutation of the compact structure, Eq. (23),
corresponds to the cyclic property of the trace, i.e. $\mathrm{tr}(\rho_A \rho'_A) = \mathrm{tr}(\rho'_A \rho_A)$.

Applying this translation to the diagrammatic Eq. (112), we obtain the Bayesian update rule as an
identity between operators,

$$\rho_{A|B} = (I_B \otimes \sqrt{\rho_A})(\sqrt{\rho_B}^{\,-1} \otimes I_A)\,\rho_{B|A}\,(\sqrt{\rho_B}^{\,-1} \otimes I_A)(I_B \otimes \sqrt{\rho_A}), \tag{116}$$

where $\rho_{B|A}$ is the conditional density operator associated with [diagram]. To see that this is indeed
the translation, it is useful to consider the diagram that incorporates each element of Eq. (116) explicitly,
namely,

[diagram] (117)

then note that the latter can be reduced to Eq. (112) by application of the spider theorem. The fact that
we require a swap map is due to our diagrammatic convention to interpret in [diagram for $A|B$] the left wire
as $A$ and the right wire as $B$ while in [diagram for $B|A$] it is the other way around. By leaving implicit the
identity operators and the product symbols in Eq. (116), it can be expressed as

$$\rho_{A|B} = \sqrt{\rho_A}\,\sqrt{\rho_B}^{\,-1}\,\rho_{B|A}\,\sqrt{\rho_B}^{\,-1}\,\sqrt{\rho_A}, \tag{118}$$

which makes the equivalence with the diagrammatic expression Eq. (112) more evident. This quantum
Bayes rule was introduced in this form in [31].

In Section 6, we show that for every dCC one can construct a model with a non-commutative
Frobenius structure and from this a graphical Bayesian calculus. By doing so for the dCC FdHilb, one can
recover the conditional density operator calculus described here.
4 Inferential presentation of Bayesian graphical calculus



Above, we represented both joint and conditional states by the same triangles, only distinguishing them 
in terms of their labeling. We will now rely on the compact structure induced by the Frobenius structure 
to clearly distinguish between givens (objects on the right of the conditional bar "|" in our notation) 
and conclusions (objects on the left of the conditional) by representing the first as inputs (appearing at 
the bottom of the diagram) and the latter as outputs (appearing at the top). We do so by defining the 
following process, which we call a conditional process: 

[diagram] (119)

We can recover the conditional state from the conditional process by applying it to the cup of the
compact structure:

[diagram] (120)

(In the context of the conditional density operator calculus, this isomorphism between conditional
processes and conditional states corresponds to the version of the Choi-Jamiolkowski isomorphism
described in [29].) For multiple givens we set:

[diagram] (121)

The normalization condition for conditional states, Eq. (42), is expressed in terms of conditional
processes as

[diagram] (122)



Using this dictionary, results that were previously expressed in terms of states may be expressed in
terms of conditional processes. For instance, the commutativity of multiplication of conditional states
in the classical Bayesian graphical calculus, described in Eq. (50), is equivalent to the commutativity
of comultiplication of conditional processes.

Proposition 4.1. In a classical Bayesian graphical calculus:

[diagram] (123)

Proof: Follows from the fact that in a classical Bayesian graphical calculus, the Frobenius structure on
states is commutative, Eq. (50), and from the definition of conditional processes in terms of conditional
states, Eq. (119). □



We shall refer to the diagrammatic representation of an expression wherein every conditional state is 
replaced by its isomorphic process as the inferential presentation because by reading the diagram from 
bottom to top one follows a chain of inferences. 

Note that one should not interpret the morphisms in a Bayesian graphical calculus as transformations 
of a physical system, but as the steps of a computation that a theorist might make in reasoning about 
the physical system. It is useful to emphasize this point. The classical Bayesian graphical calculus does 
not model the evolution of random variables undergoing stochastic maps but rather the mathematical 
operations (i.e. the belief propagation algorithm) that a statistician applies in drawing conclusions about
one random variable from information about another. Similarly, a quantum Bayesian graphical
calculus does not model the evolution of density operators under completely positive maps (in contrast to
the graphical calculi that have been introduced in other works, e.g. [10]), but rather the mathematical
operations that a quantum theorist applies in a quantum analogue of a belief propagation algorithm.

Bayes' rule for a general Bayesian calculus, described in Eq. (43), has a particularly nice form in
the inferential presentation. We simply replace the conditional states in Eq. (43) with their associated
modifiers using Eqs. (119) and (120) to obtain:




(124) 



This form can be simplified further. One easily verifies that the morphisms 



[diagram] (125)



define another compact structure on A, which we will refer to as the modified compact structure. Note 
that, like the original compact structure, it is commutative and self-dual. 

This modified compact structure simplifies diagrams considerably. For instance, the isomorphism
between conditional processes and conditional states, Eqs. (119) and (120), can be expressed elegantly
in terms of conditional processes and joint states using the new compact structure, as follows:

[diagram] (126)



We do not decorate the black box of the modified compact structure with the label of the modifier, 
since this label can be inferred from the object to which the black box is connected within an inferential 
scenario. 

The modified compact structure also provides a very simple formulation of Bayes' rule for general
Bayesian calculi. It is simply the statement that the conditional process $A|B$ is the modified transpose of $B|A$:

[diagram] (127)



It is straightforward to generalize these results to an arbitrary number of objects. For simplicity, we 
consider pairs of objects; the general case is analogous. A modifier containing a pair of object labels is 
simply the modifier defined by the joint state for those labels. One then introduces a modified compact 
structure for a pair of objects in a manner analogous to Eq. (125), namely,

[diagram: modified compact structure for the pair $AB$] (128)

Our diagrammatic convention is that the objects on the left of the modifier are in the same order as 
the objects on the right. In other words, we 'hide' the crossing of wires within the black box. This 
convention maintains the diagrams as planar as possible and in the cases where non-commutativity 
plays a role, it minimizes the number of swap operations one must display simultaneously. 
It follows, for instance, that a conditional process of the form $AB|CD$ can be expressed in terms of
the joint state $ABCD$ using the modified compact structure on $CD$:

[diagram expressing the conditional process $AB|CD$] (129)



Remark 4.2. The canonical natural isomorphism ua,b -cf. ESI §6- in the diagram 



riB 



-* ® B 



riA®B 



®r]A®lB 



(g) ®A(^B 



UA,B ® lA(g)B 

is crucially non-trivial -i.e. not just aA,B- for the modified compact structure. It is 

UA,B = {^B®A ® <^a®b) ° 'S>r]A0 Ib) ° -riB 



A®B^ B®A. 



(AB)-l 



(130) 
(131) 



Remark 4.3 (generalized transposition). The transposition rule in Eq. (127) can be generalized to
arbitrary numbers of objects, but this requires some caution. For instance, suppose one wants to express the
conditional process $ACD|BE$, which we call the target conditional, in terms of the conditional process
$AB|CDE$, which we call the source conditional. It is done as follows:

[diagram expressing $ACD|BE$ in terms of $AB|CDE$] (132)

The general prescription for how to act upon the source conditional with the modified compact structure
to obtain the target conditional is as follows (we illustrate with our example):

(1) one transposes all inputs into outputs:

[diagram involving $AB|CDE$] (133)

(2) one transposes those of the outputs which initially were inputs and that one wants to retain as
inputs back to inputs, together with those of the initial outputs one wants to transpose into inputs:

[diagram involving $AB|CDE$] (134)
To see that this is the correct prescription, note simply that if one expresses the joint state in terms of
the source conditional using the modified compact structure

[diagram] (135)

and one expresses the target conditional in terms of the joint state using the modified compact structure

[diagram] (136)

then by substituting Eq. (135) into Eq. (136), one obtains Eq. (132).



5 Conditional independence 
5.1 Definition 

One of the most important notions in the theory of Bayesian inference is that of conditional indepen- 
dence. In classical probability theory, a set of random variables X and another set Y are said to be 
conditionally independent given a third set Z if the following equivalent conditions hold: 

(a) $p(X|Y,Z) = p(X|Z)$

(b) $p(Y|X,Z) = p(Y|Z)$

(c) $p(X,Y|Z) = p(X|Z)\,p(Y|Z)$
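In the classical case the equivalence of conditions (a), (b) and (c) can be illustrated numerically. The following sketch builds a joint distribution from made-up factors $p(z)$, $p(x|z)$ and $p(y|z)$, so that conditional independence holds by construction, and then checks each of the three conditions; the numbers and the helper `prob` are illustrative assumptions.

```python
import math
from itertools import product

# Made-up factors (illustrative only); each conditional row sums to one.
P_Z = [0.4, 0.6]
P_X_GIVEN_Z = [[0.2, 0.8], [0.7, 0.3]]  # indexed [z][x]
P_Y_GIVEN_Z = [[0.5, 0.5], [0.1, 0.9]]  # indexed [z][y]

triples = list(product(range(2), repeat=3))
joint = {(x, y, z): P_Z[z] * P_X_GIVEN_Z[z][x] * P_Y_GIVEN_Z[z][y]
         for x, y, z in triples}

def prob(x=None, y=None, z=None):
    """Probability of the event that fixes the given variables (None = summed over)."""
    return sum(p for (xv, yv, zv), p in joint.items()
               if (x is None or xv == x)
               and (y is None or yv == y)
               and (z is None or zv == z))

# (a) p(X|Y,Z) = p(X|Z)
cond_a = all(math.isclose(prob(x, y, z) / prob(y=y, z=z),
                          prob(x=x, z=z) / prob(z=z)) for x, y, z in triples)
# (b) p(Y|X,Z) = p(Y|Z)
cond_b = all(math.isclose(prob(x, y, z) / prob(x=x, z=z),
                          prob(y=y, z=z) / prob(z=z)) for x, y, z in triples)
# (c) p(X,Y|Z) = p(X|Z) p(Y|Z)
cond_c = all(math.isclose(prob(x, y, z) / prob(z=z),
                          (prob(x=x, z=z) / prob(z=z)) * (prob(y=y, z=z) / prob(z=z)))
             for x, y, z in triples)
print(cond_a, cond_b, cond_c)
```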

In the general Bayesian graphical calculus, there are analogues of each of these conditions, but they 
are no longer equivalent. We therefore distinguish two pairs of notions of conditional independence. The 
first pair are the analogues of Eqs. (a) and (b) respectively, while the second pair constitute analogues 
of Eq. (c) where one differs from the other by an interchange of the roles of A and B (which yields a 
different condition due to the non-commutativity of the Frobenius structure): 

[diagrams defining the four conditions $\mathrm{CI1}_L$, $\mathrm{CI1}_R$, $\mathrm{CI2}_L$ and $\mathrm{CI2}_R$]
5.2 Results



Proposition 5.1. In a Bayesian graphical calculus, if any two of the following three equalities hold then
the third one also holds:

$\mathrm{CI1}_L$: [diagram], $\quad\mathrm{CI2}_L$: [diagram], $\quad F_L$: [diagram]

And similarly for the case where one interchanges $A$ and $B$ (where a condition $F_R$ is defined in the
obvious way).

To see this, we make use of the following lemma: 

Lemma 5.2. The condition $\mathrm{CI1}_L$ is equivalent to

$\mathrm{CI1}'_L$: [diagram]

Proof: Using Bayesian inversion together with Eq. (44) and $\mathrm{CI1}_L$, we have:

[diagram: chain of equalities] □

Proof (of Proposition 5.1): Since:

[diagram: chain of equalities] (137)

validity of any two of these equalities implies that the third also holds. The analogous equalities hold if
one interchanges $A$ and $B$. □

It is straightforward to recover the classical notion of conditional independence, as follows. 

Proposition 5.3. In a classical Bayesian graphical calculus, the four notions of conditional
independence, $\mathrm{CI1}_L$, $\mathrm{CI1}_R$, $\mathrm{CI2}_L$, $\mathrm{CI2}_R$, are all equivalent.

Proof: First, note that the equality $F_L$ always holds in a classical Bayesian graphical calculus. This is
proven using Eqs. (47) and (48) and the spider theorem:

[diagram: chain of equalities]

Similarly, one can prove the equality $F_R$, wherein $A$ and $B$ are interchanged relative to $F_L$. Given these
equalities, Prop. 5.1 implies that $\mathrm{CI1}_L$ is equivalent to $\mathrm{CI2}_L$ and that $\mathrm{CI1}_R$ is equivalent to $\mathrm{CI2}_R$.
Finally, the commutativity of the comultiplication of conditional processes, Prop. 4.1, implies that $\mathrm{CI2}_L$
and $\mathrm{CI2}_R$ are equivalent. Consequently, all four conditions are equivalent. □



What is more difficult is to recover a quantum notion of conditional independence. An open
question is whether specifying that the form of the modifiers is as given in Eq. (108) is sufficient to prove
everything that can be proven within the conditional density operator calculus. In particular, it is not
clear how to derive that $\mathrm{CI2}_L$ and $\mathrm{CI2}_R$ are equivalent.

Example 5.4. In [30], by relying on results in [22], which in turn rely on a theorem by Uhlmann
[41], it was established that in the case of the $Q_{1/2}$-calculus of Section 3.5, $\mathrm{CI1}_L$ implies $F_L$, and hence,
by Prop. 5.1, $\mathrm{CI1}_L$ is equivalent to the pair $\mathrm{CI2}_L$ and $F_L$. The analogue holds if we interchange $A$
and $B$. It would be interesting to establish whether there is a weakening of our definition of classical
Bayesian graphical calculus which also establishes this. Note here also that the assumption made in [30,
Thm 3.8] to derive $\mathrm{CI2}_L$ from $\mathrm{CI1}_L$ translates in graphical language to the condition that

[diagram] commute with [diagram] (138)

relative to the Frobenius multiplication, that is, for example:

[diagram] (139)

By a tedious calculation, it can be shown that these conditions imply the weaker condition $F_L$, which
suffices for this purpose.



5.3 Example application: generalized pooling 

As a simple example of what one can derive from the notion of conditional independence, we consider the
problem of pooling. Here, one seeks to assign a conditional state to $C$ given $A, B$ and the question is
whether this state can be expressed in terms of a conditional state for $C$ given $A$ and a conditional state
for $C$ given $B$. In the classical case (which we shall describe below), a sufficient condition for this to
be possible is that $A$ and $B$ are conditionally independent given $C$. We here consider an analogue for
general Bayesian graphical calculi.

Proposition 5.5. If $A$ and $B$ are conditionally independent relative to $C$, in the sense of $\mathrm{CI2}_L$, then we
have

[diagram expressing the conditional process $C|AB$] (140)

Proof:

[diagram: chain of equalities from $C|AB$ via $AB|C$] □

The case where $A$ and $B$ are conditionally independent relative to $C$ in the sense of $\mathrm{CI2}_R$ differs by
a swap:

[diagram expressing the conditional process $C|AB$] (141)

Example 5.6. For $Q_{1/2}$-calculi, when expressing Eq. (140) in terms of conditional states rather than in
the inferential form we obtain:

[diagram] (142)

For density operators, Eq. (142) is equivalent to

$$\rho(C|AB) = \rho(A,B)^{-\frac{1}{2}}\,\rho(A)^{\frac{1}{2}}\,\rho(B)^{\frac{1}{2}}\,\rho(C|B)\,\rho(C)^{-1}\,\rho(C|A)\,\rho(B)^{\frac{1}{2}}\,\rho(A)^{\frac{1}{2}}\,\rho(A,B)^{-\frac{1}{2}}. \tag{143}$$

For classical probability distributions, we obtain

$$P(C|AB) = \frac{P(A)\,P(B)}{P(A,B)}\,\frac{P(C|A)\,P(C|B)}{P(C)}. \tag{144}$$

This result is known as the pooling formula because if $A$ and $B$ are conditionally independent given
$C$, the posterior $P(C|AB)$ can be reconstructed from the posteriors $P(C|A)$ and $P(C|B)$ and the prior
$P(C)$ (the dependence on $A$ and $B$ is inferred from normalization). As such, it is sufficient to "pool" the
information contained in the two posteriors. Eq. (143) generalizes this to a quantum pooling formula,
and Eq. (140) generalizes this further, to arbitrary Bayesian calculi.
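The classical pooling formula, Eq. (144), can be checked on a small example. The sketch below uses made-up factors $P(C)$, $P(A|C)$ and $P(B|C)$, so that $A$ and $B$ are conditionally independent given $C$ by construction; the numbers and the helper names `p` and `pooled` are illustrative assumptions.

```python
import math
from itertools import product

# Made-up factors (illustrative only); each conditional row sums to one.
P_C = [0.3, 0.7]
P_A_GIVEN_C = [[0.9, 0.1], [0.4, 0.6]]    # indexed [c][a]
P_B_GIVEN_C = [[0.25, 0.75], [0.5, 0.5]]  # indexed [c][b]

triples = list(product(range(2), repeat=3))
joint = {(a, b, c): P_C[c] * P_A_GIVEN_C[c][a] * P_B_GIVEN_C[c][b]
         for a, b, c in triples}

def p(a=None, b=None, c=None):
    """Probability of the event that fixes the given variables (None = summed over)."""
    return sum(v for (av, bv, cv), v in joint.items()
               if (a is None or av == a)
               and (b is None or bv == b)
               and (c is None or cv == c))

def pooled(c, a, b):
    """Right-hand side of Eq. (144), built from P(C|A), P(C|B), P(C) and the marginals."""
    c_given_a = p(a=a, c=c) / p(a=a)
    c_given_b = p(b=b, c=c) / p(b=b)
    return (p(a=a) * p(b=b) / p(a=a, b=b)) * (c_given_a * c_given_b / p(c=c))

# Left-hand side of Eq. (144): the posterior P(C|AB) computed directly from the joint.
pooling_ok = all(math.isclose(p(a, b, c) / p(a=a, b=b), pooled(c, a, b))
                 for a, b, c in triples)
print(pooling_ok)
```

For a joint distribution in which $A$ and $B$ are not conditionally independent given $C$, the same check would in general fail, which is why the independence assumption appears in the statement.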
5.4 The semi-graphoid axioms

One of the reasons for identifying relationships of conditional independence among objects is to have 
the ability to describe their mutual dependencies without providing a full specification of their joint state. 
Thus, it is useful to consider what implications hold among statements of conditional independencies. 
These conditions are well known in the classical case as the semi-graphoid axioms [33]. Let $U$, $W$, $X$
and $Y$ denote sets of random variables and let $X \cup Y$ denote the set-theoretic union of $X$ and $Y$. In a
standard notation, $I(U, W \mid X)$ is taken to express the statement that the variables in $U$ and the variables
in $W$ are conditionally independent given $X$. The semi-graphoid axioms, which are easily derived from
the definition (cf. Section 5.1) of conditional independence, are:

1. Symmetry: $I(U, W \mid X) \Leftrightarrow I(W, U \mid X)$

2. Decomposition: $I(U, W \cup Y \mid X) \Rightarrow I(U, W \mid X)$

3. Weak Union: $I(U, W \cup Y \mid X) \Rightarrow I(U, W \mid X \cup Y)$

4. Contraction: $I(U, W \mid X)$ and $I(U, Y \mid X \cup W) \Rightarrow I(U, W \cup Y \mid X)$

The semi-graphoid axioms are important because their satisfaction implies the possibility of a
representation of (certain facts about) the mutual dependencies of sets of random variables in terms of a directed
acyclic graph known as a Bayesian network.

It is interesting to explore the extent to which these axioms hold true for a general Bayesian graphical 
calculus when objects play the role of sets of random variables, tensor product plays the role of set- 
theoretic union, and I{A, B \ C) expresses the statement that the ordered pair of objects A, B are 
conditionally independent given C. Because we have four distinct notions of conditional independence 
in a general Bayesian graphical calculus, one can ask about the satisfaction of the axioms for any of 
these. As it turns out, few of the axioms hold for any of the notions of conditional independence 
in a general Bayesian graphical calculus. We leave for future work the question of what additional 
ingredients are required of a Bayesian graphical calculus for the axioms to be satisfied. We note, however,
that they are all satisfied by the classical Bayesian graphical calculus. In this sense, our formalism for 
classical Bayesian inference is at least as powerful as the graphoid axiomatization. 

Significantly, Leifer and Poulin have shown in Ref. [30] that the conditional density operator
calculus satisfies the semi-graphoid axioms, so that one may apply the tools of Bayesian networks to quantum
belief propagation. Consequently, finding axiomatic graphical conditions implying the semi-graphoid 
axioms will presumably go hand-in-hand with finding an axiomatic graphical characterization of quan- 
tum Bayesian inference. 

If the semi-graphoid axioms are satisfied within a Bayesian graphical calculus, the topology of our 
graphical representation of a set of correlations will reproduce the topology of the Bayesian network 
(with objects being mapped to nodes, and morphisms being mapped to sets of directed edges). It is our 
hope that by understanding how Bayesian networks can be embedded within the diagrammatic calculus 
of dCCs, a bridge might be built between these two fields such that insights from one might be adapted 
to the other. 

6 Bayesian graphical calculi for arbitrary dagger compact categories 

6.1 A graphical concretely non-commutative dagger Frobenius structure 

We now provide a class of models, one for every dCC, each coming with a canonical non-commutative 
Frobenius structure that can be used to construct graphical Bayesian calculi, for example $Q_{1/2}$-calculi.
These include the conditional density operator calculus of Section 3.5 as a special case, namely the one
that arises for the dCC FdHilb. The diagrammatic presentation of mixed quantum states and
completely positive maps in terms of dCCs is due to Selinger [36]. But here we cannot restrict ourselves
to completely positive maps, since, as shown above in Section 3.5, the Frobenius comultiplication
cannot be a completely positive map. In this context, the concrete graphical form of this non-completely
positive map which we provide in this section will be insightful.

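Definition 6.1. Given a dCC $\mathbf{C}$ we define another dagger category $\mathbf{D}(\mathbf{C})$ as follows:

• $|\mathbf{D}(\mathbf{C})| := |\mathbf{C}|$, i.e. the set of objects is the same for the two dCCs;

• $\mathbf{D}(\mathbf{C})(A, B) := \mathbf{C}(A \otimes A^*, B \otimes B^*)$, i.e. every morphism from $A \otimes A^*$ to $B \otimes B^*$ in $\mathbf{C}$ is a
morphism from $A$ to $B$ in $\mathbf{D}(\mathbf{C})$;

• composition and dagger are inherited from $\mathbf{C}$ via the embedding

$$E : \mathbf{D}(\mathbf{C}) \to \mathbf{C} :: \begin{cases} A \mapsto A \otimes A^* \\ f \mapsto f \end{cases} \tag{145}$$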



Since D(C) is a dCC in its own right it comes with its own graphical language. It is useful to 
see how various elements of D(C) are represented both in the graphical language of D(C) and in the 
graphical language of C. Some examples are provided in the table below. The first three columns depict 
morphisms on a single object: a general morphism, identity, and composition of two morphisms. Note 
that in the graphical language of C we adopt the convention that the dual objects will be represented by 
wires to the right of the primal objects. 



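[Table: elements of D(C) depicted in the graphical language of D(C) and in the graphical language of C:
a general morphism, the identity, composition of two morphisms, a tensor product of morphisms, the
swap (symmetry), and the cups and caps of the compact structure.]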



We now consider tensor products.

Definition 6.2. For

$f_i \in \mathbf{D}(\mathbf{C})(A_i, B_i) = \mathbf{C}(A_i \otimes A_i^*, B_i \otimes B_i^*)$   (146)

we define a tensor $\otimes_D$ on D(C) as

$f_1 \otimes_D f_2 := (1_{B_1} \otimes \sigma_{B_1^*, B_2 \otimes B_2^*}) \circ (f_1 \otimes f_2) \circ (1_{A_1} \otimes \sigma_{A_2 \otimes A_2^*, A_1^*})$.   (147)

Proposition 6.3. Recalling that $f_*$ is the conjugate of $f$, i.e. the transpose of $f^\dagger$, an SMC-structure and
compactness arise on D(C) from the SMC-structure and compactness of C via the functor

$F : \mathbf{C} \to \mathbf{D}(\mathbf{C}) :: \begin{cases} A \mapsto A \\ f \mapsto f \otimes f_* \end{cases}$   (148)

which maps the tensor $\otimes$ of C to the tensor $\otimes_D$ of D(C).

Proof: This is a trivial generalization of Theorem 4.20 in [36]. □

We recall that (cf. Eq. ^) in the graphical language of C we adopt another useful convention: for 
the case where there is more than one object, the wires for the dual objects (in addition to appearing on 
the right) will appear in the opposite order to those of the primal objects. 

The table above presents some additional examples of elements of D(C) represented both in the 
graphical language of D(C) and in the graphical language of C, in particular, the last four columns 
depict a tensor product of morphisms, the swap (symmetry), and the cups and caps of the compact 
structure. 

Notation 6.4. To avoid confusion, below all $1$'s, $\otimes$'s, $\sigma$'s, $\epsilon$'s and $\eta$'s refer to the dCC C, except when
explicitly stated otherwise. We write $f : A \to_{\mathbf{D}(\mathbf{C})} B$ for a morphism made up of these components to
stipulate its type in the dCC D(C).

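Proposition 6.5. For every object $A \in |\mathbf{D}(\mathbf{C})|$ the morphism

$\mathcal{F} : A \otimes_D A \to_{\mathbf{D}(\mathbf{C})} A$

defined by

$\mathcal{F} := (1_A \otimes \epsilon_A \otimes 1_{A^*}) \circ (1_{A \otimes A} \otimes \sigma_{A^*,A^*}) : A \otimes A \otimes A^* \otimes A^* \to_{\mathbf{C}} A \otimes A^*$   (149)

is the multiplication of a dagger Frobenius structure with unit $\eta_A^{\mathcal{F}} : I \to_{\mathbf{D}(\mathbf{C})} A$ defined by

$\eta_A^{\mathcal{F}} := \eta_A : I \to_{\mathbf{C}} A \otimes A^*$.   (150)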

The following table depicts the multiplication, its unit, the comultiplication and its counit of the 
dagger Frobenius structure in the respective graphical languages of D(C) and C. 



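[Table: the multiplication, its unit, the comultiplication and its counit, each drawn in the graphical
language of D(C) and in the graphical language of C.]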



Because the two graphical representations of the multiplication have a similar shape, it is easy to
misinterpret the mapping between them. By our convention, the left leg of the diagram in D(C) is not
associated with the left pair of legs of the diagram in C, but rather with the outermost pair of legs,
while the right leg of the former is associated with the innermost pair of the latter. It is useful to
imagine a central left-right partition for the diagrams in C which divides the primal objects on the left
from the oppositely-ordered dual objects on the right. The shape of the diagram in D(C) should be
compared with the left-hand side of the diagram in C.

Note also that in the graphical language of D(C) we use a dot decorated by an 'F' to denote the 
Frobenius structure just defined. We do so to distinguish it from a Frobenius structure native to C 
(although we will not need to make use of such a structure in this article). 

Proof: We must verify that $\mathcal{F}$ is associative and satisfies the dagger Frobenius law, and that its unit is
indeed two-sided. Representing $\mathcal{F}$ and its unit in the graphical language of D(C), these properties are
given diagrammatically as Eq. (16). The tedious but straightforward proof proceeds by recasting each
identity within the graphical language of C and verifying graph isomorphism for each. For example,
associativity of the multiplication is verified as follows:

[Diagram: the associativity identity of D(C) recast in the graphical language of C.]

The other properties are verified similarly. □



The above also illustrates how a non-commutative Frobenius multiplication can be constructed from
commutative compact structures. The dagger Frobenius structure $\mathcal{F}$ induces a self-dual compact
structure, which is depicted as follows:

[Diagrams: the induced self-dual compact structure in the graphical languages of D(C) and C.]

While the Frobenius multiplication is typically non-commutative (except in the degenerate case that
$\sigma_{A,A} = 1_{A \otimes A}$, which forces C to be trivial) the induced compact structure is always commutative:

[Diagram: commutativity of the induced compact structure.]

Definition 6.6 (Selinger [36]). A morphism $f : A \to_{\mathbf{D}(\mathbf{C})} B$ in D(C) is completely positive if its
embedding in C is of the form:

$f = (g \otimes g_*) \circ (1_A \otimes \eta_C \otimes 1_{A^*}) : A \otimes A^* \to B \otimes B^*$,   (151)

for some morphism $g : A \otimes C \to_{\mathbf{C}} B$. It is normalized if we moreover have:

$\epsilon_B \circ f = \epsilon_A$.   (152)

More specifically, a point $e : I \to_{\mathbf{D}(\mathbf{C})} A$ in D(C) is a mixed state if its embedding in C is of the form:

$e = (g \otimes g_*) \circ \eta_C : I \to A \otimes A^*$   (153)

for some morphism $g : C \to_{\mathbf{C}} A$ in C. It is normalized if we moreover have:

$\epsilon_A \circ e = 1_I$.   (154)



Example 6.7. In FdHilb the concepts introduced in Definition 6.6 coincide with the usual ones; we 
explicitly establish this connection in the following section. 
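
As a reading aid, the FdHilb instance of Definition 6.6 can be unpacked into the familiar Kraus form. This is a sketch under stated assumptions: the compact structure on the ancilla object $C$ is taken to be $\eta_C = \sum_k |k\rangle \otimes |k\rangle$ for an orthonormal basis $\{|k\rangle\}$, the conjugate $g_*$ is the entrywise complex conjugate of $g$ in that basis, and the operators $K_k$ are names introduced here for illustration only.

```latex
% Kraus operators extracted from g : A \otimes C \to B (K_k is a hypothetical name)
K_k := g \circ (1_A \otimes |k\rangle) : A \to B ,
\qquad
% action of a completely positive f on the point corresponding to an operator \rho
f \;\longleftrightarrow\; \Big( \rho \mapsto \sum_k K_k \, \rho \, K_k^\dagger \Big) ,
\qquad
% normalization of f read as trace preservation
\epsilon_B \circ f = \epsilon_A
\;\Longleftrightarrow\;
\sum_k K_k^\dagger K_k = 1_A .
```

Under these assumptions the normalization condition is the usual statement that the map preserves the trace.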

It is now easy to see that the failure of complete positivity in the case of the $\mathcal{F}$-comultiplication
(cf. Section 3.5) is due to the lack of symmetry between the left and the right side of the picture:

[Diagram (155): the $\mathcal{F}$-comultiplication drawn in the graphical language of C.]

This asymmetry is also what causes it to be non-commutative.

Example 6.8. Given a normalized mixed state $e_{A \ldots Z} : I \to_{\mathbf{D}(\mathbf{C})} A \otimes_D \cdots \otimes_D Z$ in any such category
D(C), the specified Frobenius structure allows one to build a $Q_{1/2}$-calculus (provided the category has the
appropriate inverses and square-roots) wherein the mixed state plays the role of the joint state.

6.2 From operator presentation to D(C)-presentation

For the convenience of the reader who is familiar with operator theory we now provide an explicit
translation of typical operator theory concepts to the diagrammatic category D(C).

By an operator we mean an endomorphism $\rho \in \mathbf{C}(A, A)$. Such an operator $\rho$ is positive if it is
of the form:

$\rho = g \circ g^\dagger$

for some morphism $g : C \to_{\mathbf{C}} A$.

Proposition 6.9. For any object $A \in |\mathbf{C}| = |\mathbf{D}(\mathbf{C})|$, operators in C(A, A) are in bijective correspondence
with morphisms in D(C)(I, A) via the isomorphism

$\ulcorner - \urcorner : \mathbf{C}(A,A) \to \mathbf{D}(\mathbf{C})(I,A) :: \rho \mapsto (\rho \otimes 1_{A^*}) \circ \eta_A$.   (156)

Along this isomorphism, the positive operators in C(A, A) are in bijective correspondence with the
mixed states in D(C)(I, A).

Proof: That this map is a bijection follows easily from Definition 6.1 of D(C)(I, A) and the yanking
equations (7), and that positive operators are in correspondence with mixed quantum states follows from
the definitions of the latter, Eqs. (153) and (6.2), and from the definition of the conjugate, Eq. (13),
together with the yanking equations. □
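
The correspondence of Proposition 6.9 can be checked numerically in the special case C = FdHilb. The following is a minimal sketch, assuming a fixed orthonormal basis so that $\eta_A = \sum_i |i\rangle \otimes |i\rangle$ and the conjugate is entrywise complex conjugation; the function names `name`, `unname` and `mixed_state` are hypothetical and chosen only for this illustration.

```python
# Sketch for C = FdHilb with a fixed orthonormal basis (an assumption of this example).
# Operators are nested lists of complex numbers; points of D(C) are flat coefficient lists.

def name(rho):
    """(rho ⊗ 1) ∘ eta_A: the coefficient of |i>⊗|j> is rho[i][j]."""
    return [entry for row in rho for entry in row]

def unname(vec, d):
    """Inverse of `name`: rebuild the d x d operator from its coefficient list."""
    return [vec[i * d:(i + 1) * d] for i in range(d)]

def matmul(x, y):
    """Matrix product x ∘ y."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def dagger(x):
    """Conjugate transpose."""
    return [[x[i][j].conjugate() for i in range(len(x))] for j in range(len(x[0]))]

def mixed_state(g):
    """(g ⊗ g_*) ∘ eta_C for g : C -> A given as a dim(A) x dim(C) matrix."""
    dim_a, dim_c = len(g), len(g[0])
    return [sum(g[i][k] * g[j][k].conjugate() for k in range(dim_c))
            for i in range(dim_a) for j in range(dim_a)]

g = [[1 + 0j, 2j, 0j], [0j, 1 - 1j, 3 + 0j]]   # an arbitrary g : C^3 -> C^2
rho = matmul(g, dagger(g))                      # the positive operator g ∘ g†

print(name(rho) == mixed_state(g))   # positive operator <-> mixed state
print(unname(name(rho), 2) == rho)   # the map is invertible
```

In this basis the isomorphism is just row-major flattening of the matrix of $\rho$, which is why positivity of the operator and the mixed-state form of the point are the same condition.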

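The following proposition expresses how operations on operators in C relate to operations on the
corresponding points in D(C) along the isomorphism $\ulcorner - \urcorner$, most notably: that composition of operators in C

$- \circ - : \mathbf{C}(A,A) \times \mathbf{C}(A,A) \to \mathbf{C}(A,A) :: (\rho, \sigma) \mapsto \rho \circ \sigma$   (157)

corresponds to tensor product of the corresponding points in D(C) composed with the non-commutative
dagger Frobenius multiplication $\mathcal{F}$,

$\mathcal{F} \circ (- \otimes_D -) : \mathbf{D}(\mathbf{C})(I,A) \times \mathbf{D}(\mathbf{C})(I,A) \to \mathbf{D}(\mathbf{C})(I,A)$.   (158)

We also show that the partial trace of operators in C,

$\mathrm{tr}_B : \mathbf{C}(A \otimes B, A \otimes B) \to \mathbf{C}(A, A) :: \rho \mapsto (1_A \otimes \epsilon_B) \circ (\rho \otimes 1_{B^*}) \circ (1_A \otimes \eta_B)$,   (159)

corresponds in D(C) to

$\mathrm{tr}_B : \mathbf{D}(\mathbf{C})(I, A \otimes B) \to \mathbf{D}(\mathbf{C})(I, A) :: e \mapsto (1_A \otimes \epsilon_B \otimes 1_{A^*}) \circ e$.   (160)

Proposition 6.10. (i) The following diagram commutes:

$\begin{array}{ccc}
\mathbf{C}(A,A) \times \mathbf{C}(A,A) & \xrightarrow{\ \ulcorner - \urcorner \times \ulcorner - \urcorner\ } & \mathbf{D}(\mathbf{C})(I,A) \times \mathbf{D}(\mathbf{C})(I,A) \\
\downarrow {\scriptstyle - \circ -} & & \downarrow {\scriptstyle \mathcal{F} \circ (- \otimes_D -)} \\
\mathbf{C}(A,A) & \xrightarrow{\ \ulcorner - \urcorner\ } & \mathbf{D}(\mathbf{C})(I,A)
\end{array}$   (161)

and (ii) when setting $\mathrm{tr}_B$ on D(C) as in Eq. (160), then the following diagram commutes:

$\begin{array}{ccc}
\mathbf{C}(A \otimes B, A \otimes B) & \xrightarrow{\ \ulcorner - \urcorner\ } & \mathbf{D}(\mathbf{C})(I, A \otimes B) \\
\downarrow {\scriptstyle \mathrm{tr}_B} & & \downarrow {\scriptstyle \mathrm{tr}_B} \\
\mathbf{C}(A,A) & \xrightarrow{\ \ulcorner - \urcorner\ } & \mathbf{D}(\mathbf{C})(I,A)
\end{array}$   (162)

Proof: We have:

[Diagrams: graphical verification of (i) and (ii) in the graphical language of C.]

□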

Hence the caps of the compact structure in C provide the partial trace, which in D(C) becomes the
counit of the Frobenius multiplication.
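
Both parts of Proposition 6.10 can be illustrated with small matrices in FdHilb. The sketch below rests on assumptions that are not fixed by the text: legs of a tensored point are ordered primal-first with the duals in reverse order (A1, A2, A2*, A1*), and the multiplication is taken to contract the dual leg of the first factor with the primal leg of the second. All function names are hypothetical.

```python
from itertools import product

def matmul(x, y):
    """Ordinary operator composition x ∘ y for square matrices."""
    d = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

def tensor_d(e1, e2):
    """Tensor of two points of D(C); legs are ordered (A1, A2, A2*, A1*)."""
    d = len(e1)
    return {(a1, a2, b2, b1): e1[a1][b1] * e2[a2][b2]
            for a1, a2, b2, b1 in product(range(d), repeat=4)}

def multiply(t, d):
    """Assumed Frobenius multiplication: cap the A1* leg against the A2 leg."""
    return [[sum(t[(a, b, c, b)] for b in range(d)) for c in range(d)]
            for a in range(d)]

def name_ab(op, d):
    """Point on A⊗B: the coefficient at (a, b, b', a') is <a b|op|a' b'>."""
    return {(a, b, b2, a2): op[(a, b, a2, b2)]
            for a, b, a2, b2 in product(range(d), repeat=4)}

def counit_b(state, d):
    """Cap on the inner (B, B*) pair: the D(C) form of the partial trace."""
    return [[sum(state[(a, b, b, a2)] for b in range(d)) for a2 in range(d)]
            for a in range(d)]

rho = [[1, 2], [3, 4]]
sigma = [[0, 1], [1, 0]]
tau = [[5, 6], [7, 8]]

product_via_frobenius = multiply(tensor_d(rho, sigma), 2)
reversed_product = multiply(tensor_d(sigma, rho), 2)

# rho ⊗ tau as an operator on A⊗B, keyed by (a, b, a', b')
rho_tau = {(a, b, a2, b2): rho[a][a2] * tau[b][b2]
           for a, b, a2, b2 in product(range(2), repeat=4)}
reduced = counit_b(name_ab(rho_tau, 2), 2)

print(product_via_frobenius == matmul(rho, sigma))  # part (i)
print(product_via_frobenius != reversed_product)    # non-commutativity
print(reduced)                                      # tr(tau) * rho
```

Swapping the two arguments of `tensor_d` changes the result, which is the non-commutativity of the multiplication, whereas the cap used in `counit_b` is insensitive to any such ordering.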

Definition 6.11 ([36]). If C is any dCC then we define CPM(C) to be the sub-dCC of D(C) which has
the same objects as C and which has completely positive maps as morphisms.

The beauty of both D(C) and CPM(C) is that (density) operators become points rather than
operations, and that completely positive maps, rather than being mappings from (density) operators to
(density) operators, become morphisms. Similarly, the Choi-Jamiołkowski isomorphism takes a
particularly elegant form in D(C) and CPM(C), in that it becomes a bijective correspondence between
elements and morphisms.

Remark 6.12. A similar presentation of the internal endomorphism monoid in arbitrary dCCs has
already appeared in the literature, e.g. [28, 42], that is, a presentation as an object together with a
non-commutative Frobenius structure which captures composition of endomorphisms, namely:

[Diagram: the Frobenius multiplication Q in the graphical language of C.]

where Q is now easily seen to be a dagger Frobenius structure within C itself, that is, in particular, with
respect to the $\otimes$-tensor. While the $\otimes$-tensor and form of the Frobenius multiplication Q are simpler to
manipulate, the $\otimes_D$-tensor is essential for D(C) (or CPM(C)) to be closed under tensoring [36], and
the particular form of $\mathcal{F}$ is essential for it to be an internal dagger Frobenius structure within D(C).

7 Outlook 

There is much work to be done in developing the framework outlined in this paper. For one, there are
many concepts in classical Bayesian inference for which the quantum analogues are not yet known or
are still poorly understood. The development of the graphical approach described here can be informed
by progress in this area and can contribute to it.* Also, although we have confined our attention in this
work to correlations between independent systems, the conditional density operator calculus can also
be used to describe correlations across time for a single system. This is a particularly interesting topic
because, unlike in the theory of classical Bayesian inference, there is a distinction between these kinds
of correlations in quantum theory and such differences are important for understanding causality. The
distinction is captured in our graphical approach by the existence of two distinct compact structures,
the interaction between which is described by means of dualizers [14]. Along these same lines, our
approach should be useful for the project of finding a unified framework that incorporates retrodiction
(inferences from a later time to an earlier time) as well as pre- and post-selection (inferences from
information at both later and earlier times). We would also like to provide an axiomatic characterization
of the $Q_{1/2}$-calculus that is analogous to our characterization of the classical Bayesian calculus, that
is, in terms of the motion of modifiers through the Frobenius structure, and to settle the question of
whether one can prove in the $Q_{1/2}$-calculus everything that can be proved in the conditional density
operator calculus. Finally, insofar as conditioning in Bayesian probability theory has strong connections
with conditioning in a sequent calculus (which expresses the logic of provability), we expect that the
present work will provide a stepping stone to a graphical representation of sequent calculi and to a better
understanding of the interplay between probability theory and logic.

* It can also hope to provide some perspective on the wider body of work that seeks to interpret quantum theory as a theory
of Bayesian inference, for instance, the program of 'quantum Bayesianism' (see [20] and references therein).